Comment.associate "merging" comments for identical nodes #184

kubicle · 2015-01-24T11:14:33Z

Hi and thanks for Parser, it works great!

I seem to have a problem with the comments "associate" method. For example the code below:

a=1 # first time
f()
a=1 # second time
f()

will give me the following mapping:

{(send nil :f)=>[
 #<Parser::Source::Comment test1.rb:2:5 "# first time">,
 #<Parser::Source::Comment test1.rb:4:5 "# second time">
]}

So it looks like both calls to "f" have 2 lines of comment instead of 1 line each.
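
The merging can be reproduced without Parser at all. The following is a minimal sketch, assuming a hypothetical `Node` struct that stands in for `AST::Node` (Struct compares and hashes by value) and a hypothetical `associate_by_node` helper that is not the library's implementation:

```ruby
# Hypothetical stand-in for AST::Node. Struct compares and hashes by value,
# which mirrors the #eql?/#hash behaviour discussed in this issue.
Node = Struct.new(:type, :children)

# Sketch of a node-keyed association: comments are grouped under the node
# they decorate, using the node itself as the Hash key.
def associate_by_node(pairs)
  pairs.each_with_object({}) do |(node, comment), mapping|
    (mapping[node] ||= []) << comment
  end
end

first_call  = Node.new(:send, [nil, :f])
second_call = Node.new(:send, [nil, :f])

mapping = associate_by_node([
  [first_call,  "# first time"],
  [second_call, "# second time"]
])

# The two nodes are distinct objects but eql?, so they share one Hash slot.
p mapping.size          # => 1
p mapping[first_call]   # => ["# first time", "# second time"]
```

Any Hash keyed by objects with value-based `eql?` and `hash` collapses structurally identical keys in this way.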

whitequark · 2015-01-24T11:22:08Z

Wow. That was a really dumb decision on my side. Indeed #associate is broken.

whitequark · 2015-01-24T11:29:48Z

@bbatsov @yujinakayama @yorickpeterse have you bumped into this bug? (How could you possibly have not?!)

What do you think is the best course of action here? On second thought it looks like AST::Node#hash and AST::Node#eql? as currently defined are a bug factory. I want to switch them to compare-by-identity instead, but I'm not yet sure what the consequences would be.

yorickpeterse · 2015-01-24T12:30:43Z

@whitequark Not that I know of. In my case I only use comments for method definitions where this problem is probably less/not likely to occur. Changing the workings of #hash and #eql? should be fine at least in my case, at least as long as two separate instances of nodes are still considered the same (I rely on that behaviour).

whitequark · 2015-01-24T12:49:55Z

No, they will not if I implement this change, which is why I am asking.

bbatsov · 2015-01-24T13:04:13Z

I concur that the current behaviour has to be changed. The impact to RuboCop's codebase should be minimal.

P.S. Believe it or not I've never noticed this so far (which of course means our tests could have been better).

whitequark · 2015-01-24T13:15:28Z

On reflection, I think that:

1. The change to ast would be incredibly hard to safely roll out, as most gems depend on ast transitively. It is probably not an option.
2. #associate actually abuses the hash. Logically it returns not a hash table but a list of associations. So it probably should be fixed to return an array of comment-node pairs. As this changes the public API, a new function will have to be created.
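
As a rough illustration of the "list of associations" idea, here is a sketch under stated assumptions: `PairNode` is a hypothetical stand-in for `AST::Node`, and `comments_for` is a hypothetical helper, not part of Parser.

```ruby
# Hypothetical stand-in for AST::Node with value-based equality.
PairNode = Struct.new(:type, :children)

# Returns the comments paired with exactly this node object. Lookup uses
# equal? (object identity), so structurally equal nodes stay separate.
def comments_for(pairs, node)
  pairs.select { |candidate, _comment| candidate.equal?(node) }
       .map { |_node, comment| comment }
end

first_call  = PairNode.new(:send, [nil, :f])
second_call = PairNode.new(:send, [nil, :f])

# An association list: one [node, comment] entry per pairing, no Hash involved.
pairs = [
  [first_call,  "# first time"],
  [second_call, "# second time"]
]

p comments_for(pairs, first_call)   # => ["# first time"]
p comments_for(pairs, second_call)  # => ["# second time"]
```

Nothing is merged because nothing is hashed, but each lookup is a linear scan over the pairs.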

yorickpeterse · 2015-01-24T13:57:38Z

@whitequark To give some context, I depend on separate instances being considered equal in tests like this: https://github.com/YorickPeterse/oga/blob/f94461a9cadd3858bbf2bc5ee39484b865a5af96/spec/oga/xpath/parser/calls_spec.rb#L6

If two separate instances were no longer the same I'd have to fix a few thousand specifications spread across different projects.

whitequark · 2015-01-24T14:04:03Z

No, that's ==. I've never talked about changing ==, but only eql? and hash.
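
To make the distinction concrete, here is a small sketch with a hypothetical `IdentityHashedNode` class (not the real `AST::Node`) in which `==` stays structural while `eql?` and `hash` are identity based, which is roughly the change under discussion:

```ruby
# Hypothetical node class, not the real AST::Node: == stays structural,
# while eql? and hash are identity based.
class IdentityHashedNode
  attr_reader :type, :children

  def initialize(type, children)
    @type = type
    @children = children
  end

  # Structural comparison, the kind equality assertions in specs rely on.
  def ==(other)
    other.is_a?(IdentityHashedNode) &&
      type == other.type &&
      children == other.children
  end

  # Identity comparison, consulted by Hash for key lookup.
  def eql?(other)
    equal?(other)
  end

  def hash
    object_id.hash
  end
end

a = IdentityHashedNode.new(:send, [nil, :f])
b = IdentityHashedNode.new(:send, [nil, :f])

p a == b                 # => true
p a.eql?(b)              # => false
p({ a => :first }[b])    # => nil
```

Under this sketch, comparisons with `==` keep passing for separate instances, while Hash lookups no longer treat them as the same key.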

yorickpeterse · 2015-01-24T14:15:00Z

Ah ok, I thought it would apply to == as well. In that case this should be fine at least for me.

bbatsov · 2015-01-24T17:11:47Z

> The change to ast would be incredibly hard to safely roll out, as most gems depend on ast transitively. It is probably not an option.

While I cannot know how everyone is using ast, it seems unlikely that many clients will be affected by this API-wise (I still can't think of a usage in RuboCop where we're stuffing nodes in hashes). I'd suggest releasing ast 4 and being done with it.

> #associate actually abuses the hash. Logically it returns not a hash table but a list of associations. So it probably should be fixed to return an array of comment-node pairs. As this changes the public API, a new function will have to be created.

Leaving the broken associate around will likely lead to some confusion (as associate works most of the time). I think this is an example of a case when tying the parser version to the Ruby version limits our options when dealing with API changes.

Perhaps we should think harder about this, but in the end I'm fine with whatever solution you'd prefer.

/cc @jonas054

mbj · 2015-01-24T18:42:06Z

> What do you think is the best course of action here? On second thought it looks like AST::Node#hash and AST::Node#eql? as currently defined are a bug factory. I want to switch them to compare-by-identity instead, but I'm not yet sure what the consequences would be.

For me the current definitions of #eql? and #hash are perfect; they reflect the value-object semantics I'm using AST::Node instances with.

If they were redefined to work on the object's identity, my value-object use case in mutant would break.

I could fix this by using a Hash with custom (external) #eql? / #hash implementations that reflect the current value-object semantics. It would result in a small performance regression in mutant, nothing that would worry me.

It all boils down to what semantics AST::Node objects should have by default, as @whitequark said:

> #associate actually abuses the hash. Logically it returns not a hash table but a list of associations. So it probably should be fixed to return an array of comment-node pairs. As this changes the public API, a new function will have to be created.

I do not think the root cause here is the definition of AST::Node#eql? and friends; it's the currently imperfect interface of #associate. So I'd prefer to get the #associate interface changed. If that is not possible because of parser's release policy (I remember the version number should always equal the latest released Ruby version?), there could be a temporary addition like #associate_list that does not break semver, and gets promoted to #associate with ruby / parser 3.0?
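
The external #eql? / #hash idea mentioned in this comment could look roughly like the sketch below. It assumes a hypothetical `PlainNode` with default identity semantics (what a node would be after the proposed change) and a hypothetical `ValueKey` wrapper; neither is taken from ast or mutant.

```ruby
# Hypothetical node with default (identity) eql?/hash, standing in for what
# AST::Node would look like after the proposed change.
class PlainNode
  attr_reader :type, :children

  def initialize(type, children)
    @type = type
    @children = children
  end
end

# External value-object key: wraps a node and supplies structural
# eql?/hash without touching the node class itself.
class ValueKey
  attr_reader :signature

  def initialize(node)
    @signature = ValueKey.signature_of(node)
  end

  # Recursively turns a node into nested arrays, which already compare by value.
  def self.signature_of(node)
    return node unless node.is_a?(PlainNode)

    [node.type, node.children.map { |child| signature_of(child) }]
  end

  def eql?(other)
    other.is_a?(ValueKey) && signature.eql?(other.signature)
  end

  def hash
    signature.hash
  end
end

a = PlainNode.new(:send, [nil, :f, PlainNode.new(:int, [1])])
b = PlainNode.new(:send, [nil, :f, PlainNode.new(:int, [1])])

cache = { ValueKey.new(a) => :seen }
p cache[ValueKey.new(b)]   # => :seen
p({ a => :seen }[b])       # => nil
```

Building the signature walks the whole subtree for every key, which is one plausible source of the small performance regression mentioned above.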

bbatsov · 2015-01-24T19:05:40Z

> there could be a temporary addition like #associate_list that does not break semver, and gets promoted to #associate with ruby / parser 3.0?

Which might happen in 5 years...

> For me the current definitions of #eql? and #hash are perfect; they reflect the value-object semantics I'm using AST::Node instances with.

Thinking more about the problem, I'm also inclined to believe that we probably shouldn't change anything in AST. Depending on the usage, the current definitions can be viewed as either good or bad, and changing them won't really change this.

whitequark · 2015-01-24T19:36:56Z

I agree on ast. Let's figure out what to do with associate...

kubicle · 2015-01-27T08:16:00Z

Hi everyone and thank you very much for your help.
My 2 cents, since I coded a workaround in the meantime, and I thought it could do the trick if you plan to change associate's API:
I simply returned a similar hash but using node.location as the keys (instead of node).
This might look a bit hacky to you, though... However it is simpler to use than if associate was returning an array of comment-node pairs. Since most users of Parser will usually browse through the nodes, finding the associated comments is easy with a mapping node->comments (not with an array).
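
The location-keyed workaround described here can be sketched as follows. `Location`, `LocatedNode`, `build_node` and `associate_by_location` are all hypothetical names used for illustration, and the line and column values are made up; this is not kubicle's actual code.

```ruby
# Hypothetical stand-ins: Location mimics node.location, and LocatedNode
# compares by type/children only, ignoring where it appears in the source.
Location = Struct.new(:line, :column)

LocatedNode = Struct.new(:type, :children) do
  attr_accessor :location
end

def build_node(type, children, line, column)
  node = LocatedNode.new(type, children)
  node.location = Location.new(line, column)
  node
end

# Same shape as a node-keyed mapping, but keyed by node.location.
def associate_by_location(pairs)
  pairs.each_with_object({}) do |(node, comment), mapping|
    (mapping[node.location] ||= []) << comment
  end
end

first_call  = build_node(:send, [nil, :f], 2, 0)
second_call = build_node(:send, [nil, :f], 4, 0)

mapping = associate_by_location([
  [first_call,  "# first time"],
  [second_call, "# second time"]
])

p first_call == second_call       # => true
p mapping[first_call.location]    # => ["# first time"]
p mapping[second_call.location]   # => ["# second time"]
```

Two nodes with the same structure still occupy different source positions, so their locations make distinct keys while lookup stays a plain Hash access.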

bbatsov · 2015-01-27T08:48:04Z

> I agree on ast. Let's figure out what to do with associate...

That's a hard one. I can't think of a solution that is both elegant and won't break anything in the process.

whitequark · 2015-03-26T17:18:56Z

@yorickpeterse @bbatsov Please see the updated API in the master branch. Please also test it and tell me if it causes any trouble for you.

whitequark · 2015-03-26T17:19:07Z

cc @JoshCheek also

yorickpeterse · 2015-03-26T19:05:29Z

@whitequark Using the current master branch results in 263 failures out of 490 examples, current stable version of parser passes just fine. I'm getting a huge amount of errors such as can't modify frozen Parser::Source::Map::Definition. Currently the following code relies on the comment mapping: https://github.com/YorickPeterse/ruby-lint/blob/master/lib/ruby-lint/parser.rb#L41-L45

What do I have to change here to make things work again?

whitequark · 2015-03-27T08:15:24Z

@yorickpeterse After the latest master commit it should work out of the box.

JoshCheek · 2015-03-27T12:56:55Z

All my tests pass.

yorickpeterse · 2015-03-27T20:27:29Z

@whitequark Thanks, all tests pass again using the latest master commit.

kubicle mentioned this issue Mar 7, 2015

* Changed Comment::Associator#associate API and way of associating comments #188

Merged

whitequark closed this as completed in #188 Mar 26, 2015

marcandre added a commit to marcandre/parser that referenced this issue May 2, 2021

Add associate_by_identity as an alternate to associate

f6310f4

See whitequark#184

This was referenced May 2, 2021

+ Add associate_by_identity as an alternate to associate #798

Merged

ProcessedSource#ast_with_comments doesn't differentiate between identical nodes rubocop/rubocop-ast#179

Closed

iliabylich pushed a commit that referenced this issue May 2, 2021

+ Add associate_by_identity as an alternate to associate (#798)

edd0e47

See #184
