New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Ripper for lexical analyser #512

Merged
merged 151 commits into from Sep 12, 2017

Conversation

5 participants
@aycabta
Member

aycabta commented Sep 6, 2017

I replaced legacy lexical analyser from IRB with Ripper what is standard library of Ruby. It's important to support new syntax.

I explain what this code can parse Ruby code with checking diff for all Ruby documents and all Rails documents.

Preparing

Checkout each branches

mkdir test-for-rdoc
cd test-for-rdoc
git clone -b master --single-branch git@github.com:ruby/rdoc.git original-rdoc
cd original-rdoc
git checkout 0c5eb20 # This is HEAD of master as of now
bundle
cd ..
git clone -b use-ripper --single-branch git@github.com:aycabta/rdoc.git ripper-rdoc
cd ripper-rdoc
git checkout 23f12c1 # This is HEAD of use-ripper branch as of now
bundle
cd ..
git clone -b trunk --single-branch git@github.com:ruby/ruby.git
cd ruby
git checkout d68a6b3
cd ..
git clone -b master --single-branch git@github.com:rails/rails.git
cd rails
git checkout 665ac7c
cd ..

Generate CRuby documents

ruby -Ioriginal-rdoc/lib original-rdoc/exe/rdoc -o ruby-doc-original ruby
ruby -Iripper-rdoc/lib ripper-rdoc/exe/rdoc -https://github.com/zzak/sdoc/pull/112o ruby-doc-ripper ruby

Generate Rails documents

cd rials
# Tha patch below for removing monkey patch (https://github.com/zzak/sdoc/pull/112) and,
# using suitable variable (https://github.com/zzak/sdoc/pull/113) with
# suitable behavior of RDoc (https://github.com/ruby/rdoc/pull/502)
curl -fsSL https://gist.githubusercontent.com/aycabta/1d4eac210d3ede1b91a520804499a457/raw/ca90a8ce8fe3089a1d45629a01550012b2be46e1/sdoc.diff | patch -p1 
bundle update
RUBYOPT=-I../original-rdoc/lib bundle exec rake rdoc
mv doc/rdoc ../rails-doc-original
RUBYOPT=-I../ripper-rdoc/lib bundle exec rake rdoc
mv doc/rdoc ../rails-doc-ripper
cd ..

Check diff

Ruby

diff -ru --exclude "*.js" --exclude "*.gz" --exclude "*.css" --exclude "*.png" --exclude "*.gif" --exclude "*.ttf" rails-doc-original rails-doc-ripper | less

Rails

diff -ru --exclude "*.js" --exclude "*.gz" --exclude "*.css" --exclude "*.png" --exclude "*.gif" --exclude "*.ttf" ruby-doc-original ruby-doc-ripper | less

Explanation of remaining differences

These are unnecessary to fix.

Ruby

Exception2MessageMapper.html

  def bind(cl)
    self.module_eval "#{<<-"begin;"}\n#{<<-"end;"}", __FILE__, __LINE__+1
    begin;
      def Raise(err = nil, *rest)
        Exception2MessageMapper.Raise(self.class, err, *rest)
      end
      alias Fail Raise

      class << self
        undef included
      end
      def self.included(mod)
        mod.extend Exception2MessageMapper
      end
    end;
  end

I heard about this from @nobu, this evaluation code meta programming with complex heredoc is a trick for syntax highlighting of editor. This is strange and complex case, so I think that conforming output of tihs perfectly is unnecessary.

A method is listed at table_of_contents.html too.

alias

  • MonitorMixin.html
  • ruby/doc/syntax/miscellaneous_rdoc.html

For example,

alias :old_method :method

This is difficult to lexical analyse by IRB's one because it's not separated by comma. The alias syntax is special case of Ruby. In Ripper version, it becomes perfect.

ruby/doc/contributing_rdoc.html

git remote add my_fork git@github.com:my_username/ruby.git

Syntax highlighting for shell command as Ruby. RDoc has only Ruby syntax highlighting feature and it becomes normal quotation without syntex highlighting as Ruby when lexical analysing and parsing as Ruby fails with exception.

Regexp.html

def o.to_regexp() /foo/ end

A method definition one liner with regexp literal. It's so difficult to support this case, and if supports it, RDoc code becomes so complex. In Ripper version, it becomes perfect.

(GitHub syntax highlighting fails too...the correct syntax highlighted regexp literal is like below)

a =~ /foo/

Some commented out methods what are treated as document

I sent issues to ruby-core:

But in Ripper version, these skipped.

Rails

There is one difference.

--- rails-doc-original/classes/AbstractController/Callbacks/ClassMethods.html   2017-09-06 02:14:55.777896261 +0900
+++ rails-doc-ripper/classes/AbstractController/Callbacks/ClassMethods.html     2017-09-06 02:11:53.167905708 +0900
@@ -268,12 +268,14 @@
 <p>Note that <code>:only</code> has priority over <code>:if</code> in case
 they are used together.</p>
 
-<pre><code>only: :index, if: -&gt; { true } # the :if option will be ignored.</code></pre>
+<pre><code>only: :index, if: -&gt; { true } # the :if option will be ignored.
+</code></pre>

In original RDoc, the end of source code quotation is closed without newline. In Ripper RDoc, it's with newline. The IRB lexical analyser separates token to comment and the end of newline, but Ripper has the end of newline in a comment token. This is difficult to conform perfectly.

There are many cases of this in Rails diff, but these are the same as HTML rendering.

Excursus

I sent a patch for Ripper to ruby-core about lexical analysis state because this branch needed it, and I took some advices about lexical analysis state layer by @mrkn and I wrote RDoc::RipperStateLex for it. If the patch what is for Ripper is taken, I'll replace RDoc::RipperStateLex with the new Ripper after all old Ripper has gone to EOL.

aycabta added some commits Sep 5, 2017

Fix handling method parmeter
  def a, b, # comment
      c
  end

This is valid definition of method with a comment inside args.
This commit fixes behavior for it.
Support earlier than Ruby 2.4
Remove String#match? and heredoc with ">>~".
@aycabta

This comment has been minimized.

Show comment
Hide comment
@aycabta

aycabta Sep 6, 2017

Member

I warry about Rails using forked SDoc and it has wrong behavior (those are fixed original repository of SDoc, 1 and 2).

@zzak: I want to join SDoc collaborators and RubyGems owners for fix some trouble in the future, and it's just an aside, let's talk about some documentations in the future.

@amatsuda and/or other Rails committers: This SDoc problem is so severe for replacement lexical analyser with Ripper because RDoc crashes with this Pull Request certainly until now, especially about this monkey patch. I think that Rails should use original SDoc as new version.

Member

aycabta commented Sep 6, 2017

I warry about Rails using forked SDoc and it has wrong behavior (those are fixed original repository of SDoc, 1 and 2).

@zzak: I want to join SDoc collaborators and RubyGems owners for fix some trouble in the future, and it's just an aside, let's talk about some documentations in the future.

@amatsuda and/or other Rails committers: This SDoc problem is so severe for replacement lexical analyser with Ripper because RDoc crashes with this Pull Request certainly until now, especially about this monkey patch. I think that Rails should use original SDoc as new version.

@drbrain

This comment has been minimized.

Show comment
Hide comment
@drbrain

drbrain Sep 6, 2017

Member

My years-long dream come true

Member

drbrain commented Sep 6, 2017

My years-long dream come true

} && $~
line
end.join("\n")
end

This comment has been minimized.

@drbrain

drbrain Sep 6, 2017

Member

Too bad expand_tab doesn't support @options.tab_width

@drbrain

drbrain Sep 6, 2017

Member

Too bad expand_tab doesn't support @options.tab_width

This comment has been minimized.

@aycabta
@aycabta
@amatsuda

This comment has been minimized.

Show comment
Hide comment
@amatsuda

amatsuda Sep 6, 2017

Member

No worries about the sdoc thing. We can switch to any version so far as it produces correct output.

Member

amatsuda commented Sep 6, 2017

No worries about the sdoc thing. We can switch to any version so far as it produces correct output.

@hsbt hsbt self-requested a review Sep 7, 2017

@aycabta

This comment has been minimized.

Show comment
Hide comment
@aycabta

aycabta Sep 7, 2017

Member

Everyone, please test this branch for documentation to your software and check diff between original and this one.

Member

aycabta commented Sep 7, 2017

Everyone, please test this branch for documentation to your software and check diff between original and this one.

@aycabta

This comment has been minimized.

Show comment
Hide comment
@aycabta

aycabta Sep 7, 2017

Member

I wrote doctest.sh for checking diff between original RDoc and Ripper version RDoc of specified GitHub repository.

$ git clone git@gist.github.com:24a122ba798337cc3b9a7bfb32fc28f4.git doctest
$ cd doctest
$ ./doctest.sh
Usage: ./doctest.sh username/repository
$ ./doctest.sh ruby/ruby
# run diff command with less command
Member

aycabta commented Sep 7, 2017

I wrote doctest.sh for checking diff between original RDoc and Ripper version RDoc of specified GitHub repository.

$ git clone git@gist.github.com:24a122ba798337cc3b9a7bfb32fc28f4.git doctest
$ cd doctest
$ ./doctest.sh
Usage: ./doctest.sh username/repository
$ ./doctest.sh ruby/ruby
# run diff command with less command
@robin850

This comment has been minimized.

Show comment
Hide comment
@robin850

robin850 Sep 8, 2017

Contributor

Awesome work @aycabta ! 👏 ❤️

Rails is currently pointing to a fork of SDoc because the API site relies on unreleased changed and we are waiting for 1.0.0 final to be cut before updating the Gemfile but no worries, this is temporary.

By the way, the generation time is reduced by ~10s in the Rails repository with this patch ; great ! As you mentioned earlier, the outputs aren't exactly the same but apart from different code highlighting, everything seems fine regarding Rails.

Contributor

robin850 commented Sep 8, 2017

Awesome work @aycabta ! 👏 ❤️

Rails is currently pointing to a fork of SDoc because the API site relies on unreleased changed and we are waiting for 1.0.0 final to be cut before updating the Gemfile but no worries, this is temporary.

By the way, the generation time is reduced by ~10s in the Rails repository with this patch ; great ! As you mentioned earlier, the outputs aren't exactly the same but apart from different code highlighting, everything seems fine regarding Rails.

@aycabta

This comment has been minimized.

Show comment
Hide comment
@aycabta

aycabta Sep 8, 2017

Member

Rails is currently pointing to a fork of SDoc because the API site relies on unreleased changed and we are waiting for 1.0.0 final to be cut before updating the Gemfile but no worries, this is temporary.

OK, I understand what is going on.

everything seems fine regarding Rails.

Thank you, I think that this Pull Request is practically settled.

Member

aycabta commented Sep 8, 2017

Rails is currently pointing to a fork of SDoc because the API site relies on unreleased changed and we are waiting for 1.0.0 final to be cut before updating the Gemfile but no worries, this is temporary.

OK, I understand what is going on.

everything seems fine regarding Rails.

Thank you, I think that this Pull Request is practically settled.

@hsbt

This comment has been minimized.

Show comment
Hide comment
@hsbt

hsbt Sep 11, 2017

Member

Some commented out methods what are treated as document

I merged these patches into ruby core.

Member

hsbt commented Sep 11, 2017

Some commented out methods what are treated as document

I merged these patches into ruby core.

@hsbt

This comment has been minimized.

Show comment
Hide comment
@hsbt

hsbt Sep 11, 2017

Member

I confirmed that differences of rdoc documents generated old-rdoc and ripper version. I didn't find another difference without Explanation of remaining differences.

I approved to merge this pull request and release 6.0.0.beta2 before RubyKaigi 2017.

@aycabta How do you think about this release?

Member

hsbt commented Sep 11, 2017

I confirmed that differences of rdoc documents generated old-rdoc and ripper version. I didn't find another difference without Explanation of remaining differences.

I approved to merge this pull request and release 6.0.0.beta2 before RubyKaigi 2017.

@aycabta How do you think about this release?

@aycabta

This comment has been minimized.

Show comment
Hide comment
@aycabta

aycabta Sep 11, 2017

Member

@hsbt You sure can, that's why I wrote all code.

Member

aycabta commented Sep 11, 2017

@hsbt You sure can, that's why I wrote all code.

@hsbt

hsbt approved these changes Sep 12, 2017

@hsbt hsbt merged commit e96d747 into ruby:master Sep 12, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@aycabta aycabta deleted the aycabta:use-ripper branch Sep 12, 2017

@aycabta

This comment has been minimized.

Show comment
Hide comment
@aycabta

aycabta Sep 12, 2017

Member

💕 🌸 💞

Member

aycabta commented Sep 12, 2017

💕 🌸 💞

matzbot pushed a commit to ruby/ruby that referenced this pull request Sep 12, 2017

Merge rdoc-6.0.0.beta2 from upstream.
  * This version changed lexer used Ripper from lexer based IRB.
    see details: ruby/rdoc#512

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59845 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

@aycabta aycabta referenced this pull request Sep 19, 2017

Merged

Ruby 2.5.0-preview1 Released #1641

2 of 2 tasks complete
@ghost

This comment has been minimized.

Show comment
Hide comment
@ghost

ghost commented Oct 11, 2017

🌸

@headius headius referenced this pull request Dec 1, 2017

Closed

Ruby 2.5 Support #4876

64 of 75 tasks complete

robin850 added a commit to robin850/rdoc that referenced this pull request Jan 25, 2018

Fix `#tokens_to_s` method for token streams
Tiny follow-up to #512. As the token stream isn't represented by an
array of token classes but rather with an array of hashes, the token's
text is accessible through a key and not a method.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment