Parse ruleset with `not(.asd, .ass)` #128

stoivo · 2021-09-02T08:53:12Z

I have a css rule with :not(.asd, .ass) and it becomes separated into two different rule_sets. I worked around it in my project by removing the space, :not(.asd, .ass) => :not(.asd,.ass).

I have traced the issue to lib/css_parser/parser.rb#L329.

css_parser/lib/css_parser/parser.rb

Line 329 in 2b0e89c

block.scan(/\s+|\\{2,}|\\?[{}\s"]|.[^\s"{}\\]*/) do |token|

.

Also included a script to reproduce.

require 'css_parser'
parser = CssParser::Parser.new
parser.load_string! "a:not(.asd, .ass) { color: hotpink; }"

parser.each_rule_set do |rule_set, media_types|
  puts "rule_set #{rule_set}"
  rule_set.selectors.each do |selector_string|
    puts "  selector_string #{selector_string}"
  end
end

# output:
# rule_set a:not(.asd,.ass) { color: hotpink; }
#   selector_string a:not(.asd
#   selector_string .ass)

The text was updated successfully, but these errors were encountered:

grosser · 2021-09-03T04:10:59Z

PR welcome as long as it does not break other tests :)

stoivo · 2021-09-03T08:37:11Z

ok, I will see if I can come up with something. I'm a bit afraid to change it since it's soo core to the code

Can I be confident that if the spec passes everything is good?

grosser · 2021-09-03T14:11:56Z

specs are pretty good afaik ... but ideally break the area on purpose first and see what fails :)

…

On Fri, Sep 3, 2021 at 1:37 AM Simon ***@***.***> wrote: ok, I will see if I can come up with something. Can I be confident that if the spec pass everything is good? — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#128 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAACYZ5C2AVR2NFEMQDNOMTUACCMDANCNFSM5DISDB6A> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.

stoivo · 2021-10-17T20:02:04Z

I have been looking a bit around. I would like to rewrite the parser. Are you open for a merge request which replaces

css_parser/lib/css_parser/parser.rb

Lines 307 to 424 in 2b0e89c

    
           def parse_block_into_rule_sets!(block, options = {}) # :nodoc: 
        
             current_media_queries = [:all] 
        
             if options[:media_types] 
        
               current_media_queries = options[:media_types].flatten.collect { |mt| CssParser.sanitize_media_query(mt) } 
        
             end 
        
             in_declarations = 0 
        
             block_depth = 0 
        
             in_charset = false # @charset is ignored for now 
        
             in_string = false 
        
             in_at_media_rule = false 
        
             in_media_block = false 
        
             current_selectors = String.new 
        
             current_media_query = String.new 
        
             current_declarations = String.new 
        
             # once we are in a rule, we will use this to store where we started if we are capturing offsets 
        
             rule_start = nil 
        
             offset = nil 
        
             block.scan(/\s+|\\{2,}|\\?[{}\s"]|.[^\s"{}\\]*/) do |token| 
        
               # save the regex offset so that we know where in the file we are 
        
               offset = Regexp.last_match.offset(0) if options[:capture_offsets] 
        
               if token.start_with?('"') # found un-escaped double quote 
        
                 in_string = !in_string 
        
               end 
        
               if in_declarations > 0 
        
                 # too deep, malformed declaration block 
        
                 if in_declarations > 1 
        
                   in_declarations -= 1 if token.include?('}') 
        
                   next 
        
                 end 
        
                 if !in_string && token.include?('{') 
        
                   in_declarations += 1 
        
                   next 
        
                 end 
        
                 current_declarations << token 
        
                 if !in_string && token.include?('}') 
        
                   current_declarations.gsub!(/\}\s*$/, '') 
        
                   in_declarations -= 1 
        
                   current_declarations.strip! 
        
                   unless current_declarations.empty? 
        
                     if options[:capture_offsets] 
        
                       add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries) 
        
                     else 
        
                       add_rule!(current_selectors, current_declarations, current_media_queries) 
        
                     end 
        
                   end 
        
                   current_selectors = String.new 
        
                   current_declarations = String.new 
        
                   # restart our search for selectors and declarations 
        
                   rule_start = nil if options[:capture_offsets] 
        
                 end 
        
               elsif token =~ /@media/i 
        
                 # found '@media', reset current media_types 
        
                 in_at_media_rule = true 
        
                 current_media_queries = [] 
        
               elsif in_at_media_rule 
        
                 if token.include?('{') 
        
                   block_depth += 1 
        
                   in_at_media_rule = false 
        
                   in_media_block = true 
        
                   current_media_queries << CssParser.sanitize_media_query(current_media_query) 
        
                   current_media_query = String.new 
        
                 elsif token.include?(',') 
        
                   # new media query begins 
        
                   token.tr!(',', ' ') 
        
                   token.strip! 
        
                   current_media_query << token << ' ' 
        
                   current_media_queries << CssParser.sanitize_media_query(current_media_query) 
        
                   current_media_query = String.new 
        
                 else 
        
                   token.strip! 
        
                   current_media_query << token << ' ' 
        
                 end 
        
               elsif in_charset or token =~ /@charset/i 
        
                 # iterate until we are out of the charset declaration 
        
                 in_charset = !token.include?(';') 
        
               elsif !in_string && token.include?('}') 
        
                 block_depth -= 1 
        
                 # reset the current media query scope 
        
                 if in_media_block 
        
                   current_media_queries = [:all] 
        
                   in_media_block = false 
        
                 end 
        
               elsif !in_string && token.include?('{') 
        
                 current_selectors.strip! 
        
                 in_declarations += 1 
        
               else 
        
                 # if we are in a selector, add the token to the current selectors 
        
                 current_selectors << token 
        
                 # mark this as the beginning of the selector unless we have already marked it 
        
                 rule_start = offset.first if options[:capture_offsets] && rule_start.nil? && token =~ /^[^\s]+$/ 
        
               end 
        
             end 
        
             # check for unclosed braces 
        
             return unless in_declarations > 0 
        
             unless options[:capture_offsets] 
        
               return add_rule!(current_selectors, current_declarations, current_media_queries) 
        
             end 
        
             add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries) 
        
           end

? I get it that you need to see the patch before you can say you would approve it, but if it wouldn't be merged how ever good the new parser it

grosser · 2021-10-19T05:42:40Z

if you can somehow keep or lower the complexity and runtime then I'm in favor
(+tests mostly need to keep passing)
I'd not be a fan of tons of added complexity or significant runtime increase

stoivo · 2021-10-19T13:19:59Z

Ok, thanks. I will see what I can do. I will add new spec too.

stoivo · 2024-05-15T09:08:01Z

I put it on hold for a while, had to prioritize some other issues. Now I am back to it.

I found out that nokogiri has an issue related to searching for selectors like button:-moz-focusring. Nokogiri has an open issue to change css selector parser sparklemotion/nokogiri#2560. In this issue they mention two css parsers. I am thinking that css_parser should use one of those two parsers instead of writing our own. I haven't tested either jet. Would it be okey to use one of these two in css_parser. Any preference?

I think the name of this gem css_parser is a but missleading since it loads css sheets from urls, compine css sheets and more.

grosser · 2024-05-15T18:16:07Z

as long as nokogiri supports css selectors it's nice to reuse that
or are you suggesting to replace nokogiri completely ?
... if we are doing all the css parsing and not reusing nokogiri then reusing an existing css parser would be nice,
not sure how much work that would be, also need to keep the apis semi-stable and keep an eye on performance, so sounds like lots of work

renaming the gem is not really an option, would break too much I'd think

stoivo · 2024-05-16T06:22:40Z

Sorry for beening a bit unclear.

I am suggesting to replace parse_block_into_rule_sets

css_parser/lib/css_parser/parser.rb

Lines 307 to 424 in 2b0e89c

    
           def parse_block_into_rule_sets!(block, options = {}) # :nodoc: 
        
             current_media_queries = [:all] 
        
             if options[:media_types] 
        
               current_media_queries = options[:media_types].flatten.collect { |mt| CssParser.sanitize_media_query(mt) } 
        
             end 
        
             in_declarations = 0 
        
             block_depth = 0 
        
             in_charset = false # @charset is ignored for now 
        
             in_string = false 
        
             in_at_media_rule = false 
        
             in_media_block = false 
        
             current_selectors = String.new 
        
             current_media_query = String.new 
        
             current_declarations = String.new 
        
             # once we are in a rule, we will use this to store where we started if we are capturing offsets 
        
             rule_start = nil 
        
             offset = nil 
        
             block.scan(/\s+|\\{2,}|\\?[{}\s"]|.[^\s"{}\\]*/) do |token| 
        
               # save the regex offset so that we know where in the file we are 
        
               offset = Regexp.last_match.offset(0) if options[:capture_offsets] 
        
               if token.start_with?('"') # found un-escaped double quote 
        
                 in_string = !in_string 
        
               end 
        
               if in_declarations > 0 
        
                 # too deep, malformed declaration block 
        
                 if in_declarations > 1 
        
                   in_declarations -= 1 if token.include?('}') 
        
                   next 
        
                 end 
        
                 if !in_string && token.include?('{') 
        
                   in_declarations += 1 
        
                   next 
        
                 end 
        
                 current_declarations << token 
        
                 if !in_string && token.include?('}') 
        
                   current_declarations.gsub!(/\}\s*$/, '') 
        
                   in_declarations -= 1 
        
                   current_declarations.strip! 
        
                   unless current_declarations.empty? 
        
                     if options[:capture_offsets] 
        
                       add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries) 
        
                     else 
        
                       add_rule!(current_selectors, current_declarations, current_media_queries) 
        
                     end 
        
                   end 
        
                   current_selectors = String.new 
        
                   current_declarations = String.new 
        
                   # restart our search for selectors and declarations 
        
                   rule_start = nil if options[:capture_offsets] 
        
                 end 
        
               elsif token =~ /@media/i 
        
                 # found '@media', reset current media_types 
        
                 in_at_media_rule = true 
        
                 current_media_queries = [] 
        
               elsif in_at_media_rule 
        
                 if token.include?('{') 
        
                   block_depth += 1 
        
                   in_at_media_rule = false 
        
                   in_media_block = true 
        
                   current_media_queries << CssParser.sanitize_media_query(current_media_query) 
        
                   current_media_query = String.new 
        
                 elsif token.include?(',') 
        
                   # new media query begins 
        
                   token.tr!(',', ' ') 
        
                   token.strip! 
        
                   current_media_query << token << ' ' 
        
                   current_media_queries << CssParser.sanitize_media_query(current_media_query) 
        
                   current_media_query = String.new 
        
                 else 
        
                   token.strip! 
        
                   current_media_query << token << ' ' 
        
                 end 
        
               elsif in_charset or token =~ /@charset/i 
        
                 # iterate until we are out of the charset declaration 
        
                 in_charset = !token.include?(';') 
        
               elsif !in_string && token.include?('}') 
        
                 block_depth -= 1 
        
                 # reset the current media query scope 
        
                 if in_media_block 
        
                   current_media_queries = [:all] 
        
                   in_media_block = false 
        
                 end 
        
               elsif !in_string && token.include?('{') 
        
                 current_selectors.strip! 
        
                 in_declarations += 1 
        
               else 
        
                 # if we are in a selector, add the token to the current selectors 
        
                 current_selectors << token 
        
                 # mark this as the beginning of the selector unless we have already marked it 
        
                 rule_start = offset.first if options[:capture_offsets] && rule_start.nil? && token =~ /^[^\s]+$/ 
        
               end 
        
             end 
        
             # check for unclosed braces 
        
             return unless in_declarations > 0 
        
             unless options[:capture_offsets] 
        
               return add_rule!(current_selectors, current_declarations, current_media_queries) 
        
             end 
        
             add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries) 
        
           end

with a gem. crass and syntax_tree-css just parses css sheets and give back tokens. We don't have parse the string but we can work with the structure they return. I had the issue that css_parser's parser don't parse not(.asd, .ass) correctly. If we use crass it it is parsed correctly. We still have to use nokogiri to query the document and inline the styles.

Related to the name. I didn't suggest renaming the gem. Move over I just wanted to mention that css_parser's scope is lot bigger then just parsing css style sheets. It filters by media, execute imports, normalizes rules.

grosser · 2024-05-16T07:18:30Z

if you can get a POC of that working and it's not too much change + performance downgrade then I'm not against
yeah the gem does a lot, maybe the readme needs to spell that out more

stoivo · 2024-05-16T07:32:25Z

syntax_tree-css has an issues to parse selector like .a > .b so it's not an option ruby-syntax-tree/syntax_tree-css#25. Going forward with crass

stoivo changed the title ~~Parse ruleset with not(.asd, .ass)not(.asd, .ass)~~ Parse ruleset with not(.asd, .ass) Sep 3, 2021

stoivo mentioned this issue May 16, 2024

Changes to creation of rule set #154

Merged

grosser mentioned this issue May 20, 2024

New CSS features like CSS Nesting and Container Queries #155

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parse ruleset with `not(.asd, .ass)` #128

Parse ruleset with `not(.asd, .ass)` #128

stoivo commented Sep 2, 2021

grosser commented Sep 3, 2021

stoivo commented Sep 3, 2021 •

edited

grosser commented Sep 3, 2021 via email

stoivo commented Oct 17, 2021

grosser commented Oct 19, 2021

stoivo commented Oct 19, 2021

stoivo commented May 15, 2024

grosser commented May 15, 2024

stoivo commented May 16, 2024

grosser commented May 16, 2024

stoivo commented May 16, 2024 •

edited

Parse ruleset with not(.asd, .ass) #128

Parse ruleset with not(.asd, .ass) #128

Comments

stoivo commented Sep 2, 2021

grosser commented Sep 3, 2021

stoivo commented Sep 3, 2021 • edited

grosser commented Sep 3, 2021 via email

stoivo commented Oct 17, 2021

grosser commented Oct 19, 2021

stoivo commented Oct 19, 2021

stoivo commented May 15, 2024

grosser commented May 15, 2024

stoivo commented May 16, 2024

grosser commented May 16, 2024

stoivo commented May 16, 2024 • edited

Parse ruleset with `not(.asd, .ass)` #128

Parse ruleset with `not(.asd, .ass)` #128

stoivo commented Sep 3, 2021 •

edited

stoivo commented May 16, 2024 •

edited