Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse ruleset with not(.asd, .ass) #128

Open
stoivo opened this issue Sep 2, 2021 · 11 comments
Open

Parse ruleset with not(.asd, .ass) #128

stoivo opened this issue Sep 2, 2021 · 11 comments

Comments

@stoivo
Copy link
Contributor

stoivo commented Sep 2, 2021

I have a css rule with :not(.asd, .ass) and it becomes separated into two different rule_sets. I worked around it in my project by removing the space, :not(.asd, .ass) => :not(.asd,.ass).

I have traced the issue to lib/css_parser/parser.rb#L329.

block.scan(/\s+|\\{2,}|\\?[{}\s"]|.[^\s"{}\\]*/) do |token|
.

Also included a script to reproduce.

require 'css_parser'
parser = CssParser::Parser.new
parser.load_string! "a:not(.asd, .ass) { color: hotpink; }"

parser.each_rule_set do |rule_set, media_types|
  puts "rule_set #{rule_set}"
  rule_set.selectors.each do |selector_string|
    puts "  selector_string #{selector_string}"
  end
end

# output:
# rule_set a:not(.asd,.ass) { color: hotpink; }
#   selector_string a:not(.asd
#   selector_string .ass)
@grosser
Copy link
Contributor

grosser commented Sep 3, 2021

PR welcome as long as it does not break other tests :)

@stoivo
Copy link
Contributor Author

stoivo commented Sep 3, 2021

ok, I will see if I can come up with something. I'm a bit afraid to change it since it's soo core to the code

Can I be confident that if the spec passes everything is good?

@stoivo stoivo changed the title Parse ruleset with not(.asd, .ass)not(.asd, .ass) Parse ruleset with not(.asd, .ass) Sep 3, 2021
@grosser
Copy link
Contributor

grosser commented Sep 3, 2021 via email

@stoivo
Copy link
Contributor Author

stoivo commented Oct 17, 2021

I have been looking a bit around. I would like to rewrite the parser. Are you open for a merge request which replaces

def parse_block_into_rule_sets!(block, options = {}) # :nodoc:
current_media_queries = [:all]
if options[:media_types]
current_media_queries = options[:media_types].flatten.collect { |mt| CssParser.sanitize_media_query(mt) }
end
in_declarations = 0
block_depth = 0
in_charset = false # @charset is ignored for now
in_string = false
in_at_media_rule = false
in_media_block = false
current_selectors = String.new
current_media_query = String.new
current_declarations = String.new
# once we are in a rule, we will use this to store where we started if we are capturing offsets
rule_start = nil
offset = nil
block.scan(/\s+|\\{2,}|\\?[{}\s"]|.[^\s"{}\\]*/) do |token|
# save the regex offset so that we know where in the file we are
offset = Regexp.last_match.offset(0) if options[:capture_offsets]
if token.start_with?('"') # found un-escaped double quote
in_string = !in_string
end
if in_declarations > 0
# too deep, malformed declaration block
if in_declarations > 1
in_declarations -= 1 if token.include?('}')
next
end
if !in_string && token.include?('{')
in_declarations += 1
next
end
current_declarations << token
if !in_string && token.include?('}')
current_declarations.gsub!(/\}\s*$/, '')
in_declarations -= 1
current_declarations.strip!
unless current_declarations.empty?
if options[:capture_offsets]
add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries)
else
add_rule!(current_selectors, current_declarations, current_media_queries)
end
end
current_selectors = String.new
current_declarations = String.new
# restart our search for selectors and declarations
rule_start = nil if options[:capture_offsets]
end
elsif token =~ /@media/i
# found '@media', reset current media_types
in_at_media_rule = true
current_media_queries = []
elsif in_at_media_rule
if token.include?('{')
block_depth += 1
in_at_media_rule = false
in_media_block = true
current_media_queries << CssParser.sanitize_media_query(current_media_query)
current_media_query = String.new
elsif token.include?(',')
# new media query begins
token.tr!(',', ' ')
token.strip!
current_media_query << token << ' '
current_media_queries << CssParser.sanitize_media_query(current_media_query)
current_media_query = String.new
else
token.strip!
current_media_query << token << ' '
end
elsif in_charset or token =~ /@charset/i
# iterate until we are out of the charset declaration
in_charset = !token.include?(';')
elsif !in_string && token.include?('}')
block_depth -= 1
# reset the current media query scope
if in_media_block
current_media_queries = [:all]
in_media_block = false
end
elsif !in_string && token.include?('{')
current_selectors.strip!
in_declarations += 1
else
# if we are in a selector, add the token to the current selectors
current_selectors << token
# mark this as the beginning of the selector unless we have already marked it
rule_start = offset.first if options[:capture_offsets] && rule_start.nil? && token =~ /^[^\s]+$/
end
end
# check for unclosed braces
return unless in_declarations > 0
unless options[:capture_offsets]
return add_rule!(current_selectors, current_declarations, current_media_queries)
end
add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries)
end
? I get it that you need to see the patch before you can say you would approve it, but if it wouldn't be merged how ever good the new parser it

@grosser
Copy link
Contributor

grosser commented Oct 19, 2021

if you can somehow keep or lower the complexity and runtime then I'm in favor
(+tests mostly need to keep passing)
I'd not be a fan of tons of added complexity or significant runtime increase

@stoivo
Copy link
Contributor Author

stoivo commented Oct 19, 2021

Ok, thanks. I will see what I can do. I will add new spec too.

@stoivo
Copy link
Contributor Author

stoivo commented May 15, 2024

I put it on hold for a while, had to prioritize some other issues. Now I am back to it.

I found out that nokogiri has an issue related to searching for selectors like button:-moz-focusring. Nokogiri has an open issue to change css selector parser sparklemotion/nokogiri#2560. In this issue they mention two css parsers. I am thinking that css_parser should use one of those two parsers instead of writing our own. I haven't tested either jet. Would it be okey to use one of these two in css_parser. Any preference?

I think the name of this gem css_parser is a but missleading since it loads css sheets from urls, compine css sheets and more.

@grosser
Copy link
Contributor

grosser commented May 15, 2024

as long as nokogiri supports css selectors it's nice to reuse that
or are you suggesting to replace nokogiri completely ?
... if we are doing all the css parsing and not reusing nokogiri then reusing an existing css parser would be nice,
not sure how much work that would be, also need to keep the apis semi-stable and keep an eye on performance, so sounds like lots of work

renaming the gem is not really an option, would break too much I'd think

@stoivo
Copy link
Contributor Author

stoivo commented May 16, 2024

Sorry for beening a bit unclear.

I am suggesting to replace parse_block_into_rule_sets

def parse_block_into_rule_sets!(block, options = {}) # :nodoc:
current_media_queries = [:all]
if options[:media_types]
current_media_queries = options[:media_types].flatten.collect { |mt| CssParser.sanitize_media_query(mt) }
end
in_declarations = 0
block_depth = 0
in_charset = false # @charset is ignored for now
in_string = false
in_at_media_rule = false
in_media_block = false
current_selectors = String.new
current_media_query = String.new
current_declarations = String.new
# once we are in a rule, we will use this to store where we started if we are capturing offsets
rule_start = nil
offset = nil
block.scan(/\s+|\\{2,}|\\?[{}\s"]|.[^\s"{}\\]*/) do |token|
# save the regex offset so that we know where in the file we are
offset = Regexp.last_match.offset(0) if options[:capture_offsets]
if token.start_with?('"') # found un-escaped double quote
in_string = !in_string
end
if in_declarations > 0
# too deep, malformed declaration block
if in_declarations > 1
in_declarations -= 1 if token.include?('}')
next
end
if !in_string && token.include?('{')
in_declarations += 1
next
end
current_declarations << token
if !in_string && token.include?('}')
current_declarations.gsub!(/\}\s*$/, '')
in_declarations -= 1
current_declarations.strip!
unless current_declarations.empty?
if options[:capture_offsets]
add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries)
else
add_rule!(current_selectors, current_declarations, current_media_queries)
end
end
current_selectors = String.new
current_declarations = String.new
# restart our search for selectors and declarations
rule_start = nil if options[:capture_offsets]
end
elsif token =~ /@media/i
# found '@media', reset current media_types
in_at_media_rule = true
current_media_queries = []
elsif in_at_media_rule
if token.include?('{')
block_depth += 1
in_at_media_rule = false
in_media_block = true
current_media_queries << CssParser.sanitize_media_query(current_media_query)
current_media_query = String.new
elsif token.include?(',')
# new media query begins
token.tr!(',', ' ')
token.strip!
current_media_query << token << ' '
current_media_queries << CssParser.sanitize_media_query(current_media_query)
current_media_query = String.new
else
token.strip!
current_media_query << token << ' '
end
elsif in_charset or token =~ /@charset/i
# iterate until we are out of the charset declaration
in_charset = !token.include?(';')
elsif !in_string && token.include?('}')
block_depth -= 1
# reset the current media query scope
if in_media_block
current_media_queries = [:all]
in_media_block = false
end
elsif !in_string && token.include?('{')
current_selectors.strip!
in_declarations += 1
else
# if we are in a selector, add the token to the current selectors
current_selectors << token
# mark this as the beginning of the selector unless we have already marked it
rule_start = offset.first if options[:capture_offsets] && rule_start.nil? && token =~ /^[^\s]+$/
end
end
# check for unclosed braces
return unless in_declarations > 0
unless options[:capture_offsets]
return add_rule!(current_selectors, current_declarations, current_media_queries)
end
add_rule_with_offsets!(current_selectors, current_declarations, options[:filename], (rule_start..offset.last), current_media_queries)
end

with a gem. crass and syntax_tree-css just parses css sheets and give back tokens. We don't have parse the string but we can work with the structure they return. I had the issue that css_parser's parser don't parse not(.asd, .ass) correctly. If we use crass it it is parsed correctly. We still have to use nokogiri to query the document and inline the styles.

Related to the name. I didn't suggest renaming the gem. Move over I just wanted to mention that css_parser's scope is lot bigger then just parsing css style sheets. It filters by media, execute imports, normalizes rules.

@grosser
Copy link
Contributor

grosser commented May 16, 2024

  • if you can get a POC of that working and it's not too much change + performance downgrade then I'm not against
  • yeah the gem does a lot, maybe the readme needs to spell that out more

@stoivo
Copy link
Contributor Author

stoivo commented May 16, 2024

syntax_tree-css has an issues to parse selector like .a > .b so it's not an option ruby-syntax-tree/syntax_tree-css#25. Going forward with crass

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants