Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comment.associate "merging" comments for identical nodes #184

Closed
kubicle opened this issue Jan 24, 2015 · 21 comments · Fixed by #188
Closed

Comment.associate "merging" comments for identical nodes #184

kubicle opened this issue Jan 24, 2015 · 21 comments · Fixed by #188

Comments

@kubicle
Copy link
Contributor

kubicle commented Jan 24, 2015

Hi and thanks for Parser, it works great!

I seem to have a problem with the comments "associate" method. For example the code below:

a=1 # first time
f()
a=1 # second time
f()

will give me the following mapping:

{(send nil :f)=>[
 #<Parser::Source::Comment test1.rb:2:5 "# first time">,
 #<Parser::Source::Comment test1.rb:4:5 "# second time">
]}

So it looks like both calls to "f" have 2 lines of comment instead of 1 line each.

@whitequark
Copy link
Owner

Wow. That was a really dumb decision on my side. Indeed #associate is broken.

@whitequark
Copy link
Owner

@bbatsov @yujinakayama @yorickpeterse have you bumped into this bug? (How could you possibly have not?!)

What do you think is the best course of action here? On a second thought it looks like AST::Node#hash and AST::Node#eql? as currently defined are a bug factory. I want to switch them to compare-by-identity instead, but not sure yet what would be the consequences.

@yorickpeterse
Copy link
Contributor

@whitequark Not that I know of. In my case I only use comments for method definitions where this problem is probably less/not likely to occur. Changing the workings of #hash and #eql? should be fine at least in my case, at least as long as two separate instances of nodes are still considered the same (I rely on that behaviour).

@whitequark
Copy link
Owner

No, they will not if I implement this change, which is why I am asking.

@bbatsov
Copy link
Contributor

bbatsov commented Jan 24, 2015

I concur that the current behaviour has to be changed. The impact to RuboCop's codebase should be minimal.

P.S. Believe it or not I've never noticed this so far (which of course means our tests could have been better).

@whitequark
Copy link
Owner

On reflection, I think that:

  • The change to ast would be incredibly hard to safely roll out, as most gems depend on ast transitively. It is probably not an option.
  • #associate actually abuses the hash. Logically it returns not a hash table but a list of associations. So it probably should be fixed to return an array of comment-node pairs. As this changes the public API, a new function will have to be created.

@yorickpeterse
Copy link
Contributor

@whitequark To give some context, I depend on separate instances being considered equal in tests like this: https://github.com/YorickPeterse/oga/blob/f94461a9cadd3858bbf2bc5ee39484b865a5af96/spec/oga/xpath/parser/calls_spec.rb#L6

If two separate instances were no longer the same I'd have to fix a few thousand specifications spread across different projects.

@whitequark
Copy link
Owner

No, that's ==. I've never talked about changing ==, but only eql? and hash.

@yorickpeterse
Copy link
Contributor

Ah ok, I thought it would apply to == as well. In that case this should be fine at least for me.

@bbatsov
Copy link
Contributor

bbatsov commented Jan 24, 2015

The change to ast would be incredibly hard to safely roll out, as most gems depend on ast transitively. It is probably not an option.

While I cannot no how everyone is using ast it seems unlikely that many clients will be affected by this (I still can't think of an usage in RuboCop where we're stuffing nodes in hashes) API-wise. I'd suggest releasing ast 4 and be done with it.

#associate actually abuses the hash. Logically it returns not a hash table but a list of associations. So it probably should be fixed to return an array of comment-node pairs. As this changes the public API, a new function will have to be created.

Leaving the broken associate around will likely lead to some confusion (as associate works most of the time). I think this is an example of a case when tying the parser version to the Ruby version limits our options when dealing with API changes.

Perhaps we should think harder about this, but in the end I'm fine with whatever solution you'd prefer.

/cc @jonas054

@mbj
Copy link
Collaborator

mbj commented Jan 24, 2015

What do you think is the best course of action here? On a second thought it looks like AST::Node#hash and AST::Node#eql? as currently defined are a bug factory. I want to switch them to compare-by-identity instead, but not sure yet what would be the consequences.

For me the current definition of #eql? and #hash are perfect, they reflect value object semantics I'm using AST::Node instances with.

If they where redefined to work on the objects identity my use value-object use case in mutant would break.

I could fix this via using a Hash with custom (external #eql? / #hash) implementations, that reflect the current value object semantics. It would result in a small performance regression in mutant, nothing that would make me worry.

It all boils down to what semantics AST::Node objects should have by default, as @whitequark said:

#associate actually abuses the hash. Logically it returns not a hash table but a list of associations. So it probably should be fixed to return an array of comment-node pairs. As this changes the public API, a new function will have to be created.

I do no think the root cause here is the definition of AST::Node#eql? and friends, its the currently imperfect interface of #associate. So I'd prefer to get #associate interface changed. If that is not possible because of parsers release policy (I remember the version number should always equal the latest released ruby version?), there could be a temporal addition like #associate_list that does not break semver, and gets promoted to #associate with ruby / parser 3.0?

@bbatsov
Copy link
Contributor

bbatsov commented Jan 24, 2015

there could be a temporal addition like #associate_list that does not break semver, and gets promoted to #associate with ruby / parser 3.0?

Which might happen in 5 years...

For me the current definition of #eql? and #hash are perfect, they reflect value object semantics I'm using AST::Node instances with.

Thinking more about the problem I'm also inclined to believe that probably we shouldn't change anything in AST. Depending of the usage the current definitions can viewed as either good or bad and changing them won't really change this.

@whitequark
Copy link
Owner

I agree on ast. Let's figure out what to do with associate...

@kubicle
Copy link
Contributor Author

kubicle commented Jan 27, 2015

Hi everyone and thank you VM for your help.
My 2 cents, since I coded a workaround in the meantime, and I thought it could do the trick if you plan to change associate's API:
I simply returned a similar hash but using node.location as the keys (instead of node).
This might look a bit hacky to you, though... However it is simpler to use than if associate was returning an array of comment-node pairs. Since most users of Parser will usually browse through the nodes, finding the associated comments is easy with a mapping node->comments (not with an array).

@bbatsov
Copy link
Contributor

bbatsov commented Jan 27, 2015

I agree on ast. Let's figure out what to do with associate...

That's a hard one. I can't think of a solution that is both elegant and won't break anything in the process.

@whitequark
Copy link
Owner

@yorickpeterse @bbatsov Please see the updated API in the master branch. Please also test and tell if it causes any trouble for you.

@whitequark
Copy link
Owner

cc @JoshCheek also

@yorickpeterse
Copy link
Contributor

@whitequark Using the current master branch results in 263 failures out of 490 examples, current stable version of parser passes just fine. I'm getting a huge amount of errors such as can't modify frozen Parser::Source::Map::Definition. Currently the following code relies on the comment mapping: https://github.com/YorickPeterse/ruby-lint/blob/master/lib/ruby-lint/parser.rb#L41-L45

What do I have to change here to make things work again?

@whitequark
Copy link
Owner

@yorickpeterse After latest master commit should work out of the box.

@JoshCheek
Copy link
Collaborator

All my tests pass.

@yorickpeterse
Copy link
Contributor

@whitequark Thanks, all tests pass again using the latest master commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants