[imap ] unknown token - "K" #2916

Open
waghanza opened this Issue · 33 comments

Marwan Rabbâa

Hi,

I have successfully installed Rubinius on my system:

  • Rubinius : rubinius 2.2.4 (2.1.0 fd07f670 2014-02-01 JI) [x86_64-linux-gnu]
  • OS : Fedora 20 x86_64
  • RVM/Rbenv : without
  • Rbx_Path : /opt/rbx/bin

I have a basic script that fetches my email (Gmail-powered), and it crashes after

imap = Net::IMAP.new('imap.googlemail.com', 993, usessl = true, certs = nil, verify = false)

The stack trace is:

An exception occurred running ./sms_sync.rb:

    unknown token - "K" (Net::IMAP::ResponseParseError)

Backtrace:

        Net::IMAP::ResponseParser#parse_error at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:3362
         Net::IMAP::ResponseParser#next_token at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:3279
          Net::IMAP::ResponseParser#lookahead at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:3224
              Net::IMAP::ResponseParser#match at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:3212
      Net::IMAP::ResponseParser#text_response at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:2720
  Net::IMAP::ResponseParser#response_untagged at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:2176
           Net::IMAP::ResponseParser#response at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:2132
              Net::IMAP::ResponseParser#parse at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:2058
                       Net::IMAP#get_response at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:1181
                         Net::IMAP#initialize at /opt/rbx/gems/gems/rubysl-net-imap-2.0.1/lib/rubysl/net/imap/imap.rb:1062
                            Object#__script__ at sms_sync.rb:23
             Rubinius::CodeLoader#load_script at kernel/delta/code_loader.rb:66
             Rubinius::CodeLoader.load_script at kernel/delta/code_loader.rb:140
                      Rubinius::Loader#script at kernel/loader.rb:649
                        Rubinius::Loader#main at kernel/loader.rb:831
Benjamin Klotz
Collaborator

I just rushed through it and found this:
line 3236 checks whether @str matches BEG_REGEXP at @pos

if @str.index(BEG_REGEXP, @pos)
  @pos = $~.end(0)
  if $1
    return Token.new(T_SPACE, $+)
  elsif $2
    return Token.new(T_NIL, $+)
  elsif $3
    return Token.new(T_NUMBER, $+)
  elsif $4
    return Token.new(T_ATOM, $+)
  elsif $5
    return Token.new(T_QUOTED, $+.gsub(/\\(["\\])/n, "\\1"))
  elsif $6
    return Token.new(T_LPAR, $+)
  elsif $7
    return Token.new(T_RPAR, $+)
  elsif $8
    return Token.new(T_BSLASH, $+)
  elsif $9
    return Token.new(T_STAR, $+)
  elsif $10
    return Token.new(T_LBRA, $+)
  elsif $11
    return Token.new(T_RBRA, $+)
  elsif $12
    len = $+.to_i
    val = @str[@pos, len]
    @pos += len
    return Token.new(T_LITERAL, val)
  elsif $13
    return Token.new(T_PLUS, $+)
  elsif $14
    return Token.new(T_PERCENT, $+)
  elsif $15
    return Token.new(T_CRLF, $+)
  elsif $16
    return Token.new(T_EOF, $+)
  else
    parse_error("[Net::IMAP BUG] BEG_REGEXP is invalid")
  end
else
  @str.index(/\S*/n, @pos)
  parse_error("unknown token - %s", $&.dump)
end

@str.index(BEG_REGEXP, @pos) returns nil (falsy), so the else branch runs and parse_error is called.
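
The dispatch pattern in the excerpt above (one \G-anchored alternation, then a check of which capture group matched) can be sketched in isolation. MINI_REGEXP and next_token are made-up names with only three alternatives; this is an illustration under those assumptions, not the Net::IMAP code.

```ruby
# Simplified stand-in for BEG_REGEXP: three alternatives instead of sixteen.
# MINI_REGEXP and next_token are illustrative names, not part of Net::IMAP.
MINI_REGEXP = /\G(?:( +)|(\d+)|([A-Za-z]+))/

def next_token(str, pos)
  # String#index with a Regexp returns the match offset, or nil on no match.
  # \G pins the match to pos, so the regexp cannot skip ahead.
  if str.index(MINI_REGEXP, pos)
    new_pos = $~.end(0)
    kind = if $1 then :space
           elsif $2 then :number
           else :atom
           end
    # $+ is the highest-numbered group that took part in the match.
    [kind, $+, new_pos]
  else
    # Same error path as the excerpt: report the unparsable run of characters.
    str.index(/\S*/, pos)
    raise ArgumentError, format("unknown token - %s", $&.dump)
  end
end

p next_token("OK 42", 0)
p next_token("OK 42", 2)
p next_token("OK 42", 3)
```

If the regexp engine fails to match at pos, control falls into the else branch, which is where the "unknown token" message in this issue comes from.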

I don't know exactly why this is happening. As I said, I only rushed through it.

I will look at it more closely later.

Benjamin Klotz
Collaborator

When you output @str, it contains something like * OK Gimap ready for requests from 80.110.13.242 f8mb3483280eep

Maybe K should be added as a token in BEG_REGEXP?
I don't know much about the class implementation yet, but I think the string is correct.

Maybe someone who understands the IMAP class better could take over from here.

Benjamin Klotz
Collaborator

I looked at the IMAP class in Ruby 2.1.0.
I get the same output for @str (* OK Gimap ready for requests from .... ) in rbx 2.2.4 and MRI 2.1.0,
but MRI doesn't raise an error.
So I looked for differences in imap.rb but didn't find any.

The only logical explanation I can see is a difference in the implementation of String#index.

https://gist.github.com/Benny1992/8865959

The return values of String#index differ.
The gist produces the following output on MRI 2.1.0:

return value: 0
return class: Fixnum
true or value

Output on rbx 2.2.4:

return value: 
return class: NilClass
false
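
The gist's contents are not reproduced in this thread, so the following is only a guess at the kind of check it performs. The ATOM pattern is a cut-down assumption that keeps just the ATOM alternative of BEG_REGEXP, and describe_index is a hypothetical helper.

```ruby
# Cut-down ATOM alternative; the character class mirrors the one in BEG_REGEXP,
# but everything else in this script is an assumption made for illustration.
ATOM = /\G([^\x80-\xff(){ \x00-\x1f\x7f%*"\\\[\]+]+)/n

def describe_index(str, pattern)
  result = str.index(pattern, 0)
  # nil is falsy in Ruby, while 0 is truthy, so the two outcomes take
  # different branches in an `if str.index(...)` test.
  { value: result, klass: result.class, truthy: result ? true : false }
end

report = describe_index("K", ATOM)
puts "return value: #{report[:value].inspect}"
puts "return class: #{report[:klass]}"
puts(report[:truthy] ? "true or value" : "false")
```

On MRI the match succeeds at offset 0. Newer MRI versions report the class as Integer rather than Fixnum.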
Federico Ravasio

Yep, looks like Rubinius isn't matching "K", given the same regexp.

Federico Ravasio

The regexp is very big and complex. Could you try to reduce it further? It'd really help when debugging it. :)

Benjamin Klotz
Collaborator

Found a difference in the IMAP class

@str, @pos on MRI 2.1.0:

* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh
0
* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh
1
* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh
2
* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh
4
* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh
63
* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh
65
* OK Gimap ready for requests from 80.110.13.242 d4mb7335843eeh

@str, @pos on rbx 2.2.4:

* OK Gimap ready for requests from 80.110.13.242 46mb7332423eee
0
* OK Gimap ready for requests from 80.110.13.242 46mb7332423eee
1
* OK Gimap ready for requests from 80.110.13.242 46mb7332423eee
2
* OK Gimap ready for requests from 80.110.13.242 46mb7332423eee
3
* OK Gimap ready for requests from 80.110.13.242 46mb7332423eee

Maybe this is another error? Or is it an error in MRI? Should @pos be incremented by 1 each time or not?

Sorry for so many questions, but I'm still a Ruby internals beginner :)
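
On the @pos question: in the parser excerpt earlier in the thread, @pos is set to $~.end(0), so it moves to the end of whatever token matched rather than advancing by 1. A rough sketch with a simplified, hypothetical TOKEN pattern (not the real BEG_REGEXP) shows the same kind of jump for a two-character atom:

```ruby
# TOKEN is a simplified, hypothetical stand-in for BEG_REGEXP.
TOKEN = /\G(?:( +)|(\*)|(\d+)|([A-Za-z]+)|(\r\n))/

def token_positions(str)
  pos = 0
  positions = [pos]
  while pos < str.length
    break unless str.index(TOKEN, pos)
    pos = $~.end(0) # jump to the end of the matched token
    positions << pos
  end
  positions
end

p token_positions("* OK 12\r\n")
# on MRI: [0, 1, 2, 4, 5, 7, 9]
```

The step from 2 to 4 corresponds to the whole atom "OK" being consumed. A trace that goes 2, 3 instead would be consistent with only "O" having matched.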

Federico Ravasio

Could you take a look at https://github.com/rubysl/rubysl-net-imap and see if it's in sync with MRI's imap lib? Since it's part of stdlib, Rubinius fetches and uses it from there.
That said, there sure is a problem with Rubinius processing that regexp you've found.

Benjamin Klotz
Collaborator

diff returns nothing, so there is no difference.
The other files in rubysl-net-imap only contain requires,
so I think the two are in sync.

Federico Ravasio

Thanks for digging into this.
We'll try to fix the issue with the regexp; when it's done you can try your script again to see if it was enough or there's something else breaking it.

Benjamin Klotz
Collaborator

No problem.
If you need any help, let me know.
I'll try to simplify the regex when I'm home again :)

Benjamin Klotz
Collaborator

Modifying the regex gives the following results:

      BEG_REGEXP = /\G(?:\
(?# 1:  SPACE   )( +)|\
(?# 2:  NIL     )(NIL)(?=[\x80-\xff(){ \x00-\x1f\x7f%*#{'"'}\\\[\]+])|\
(?# 3:  NUMBER  )(\d+)(?=[\x80-\xff(){ \x00-\x1f\x7f%*#{'"'}\\\[\]+])|\
(?# 4:  ATOM    )([^\x80-\xff(){ \x00-\x1f\x7f%*#{'"'}\\\[\]+]+)|\
(?# 5:  QUOTED  )"((?:[^\x00\r\n"\\]|\\["\\])*)"|\
(?# 6:  LPAR    )(\()|\
(?# 7:  RPAR    )(\))|\
(?# 8:  BSLASH  )(\\)|\
(?# 9:  STAR    )(\*)|\
(?# 10: LBRA    )(\[)|\
(?# 11: RBRA    )(\])|\
(?# 12: LITERAL )\{(\d+)\}\r\n|\
(?# 13: PLUS    )(\+)|\
(?# 14: PERCENT )(%)|\
(?# 15: CRLF    )(\r\n)|\
(?# 16: EOF     )(\z))/ni

# (?# 3:  NUMBER  ) removed - MRI and rbx both raise an error
# (?# 4:  ATOM    ) removed - MRI and rbx both raise an error
# (?# 16: EOF     ) removed - both return 0 (Fixnum)

\z matches the end of the string,
so apparently rbx has a problem with the end of a string.
I'll look into rbx's String#index implementation; maybe I'll find something :)
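
The reduction approach above (drop one alternative at a time and compare results across implementations) can be automated. This is a minimal sketch that assumes a much smaller, hypothetical set of branches than the real BEG_REGEXP. BRANCHES and index_without are illustrative names.

```ruby
# Hypothetical reduction helper: rebuild an alternation with one branch left
# out at a time and record what String#index returns for each variant.
BRANCHES = {
  space:  '( +)',
  number: '(\d+)',
  atom:   '([A-Za-z]+)',
  eof:    '(\z)'
}.freeze

def index_without(branch, str, pos)
  kept = BRANCHES.reject { |name, _| name == branch }.values
  str.index(Regexp.new('\G(?:' + kept.join('|') + ')'), pos)
end

BRANCHES.each_key do |name|
  puts format('%-6s removed -> %p', name, index_without(name, 'K', 0))
end
```

Running the same script on both implementations and diffing the output points at the branch whose removal changes the result on only one of them.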

Sam

I too am having the same issue.

I'm running Redmine on Rubinius 2.2.5 and when I run the email rake tasks that use rubysl-net-imap-2.0.1, I get a parse error 'unknown token - "K"'

rubysl/rubysl-net-imap#1

Sam sammcj referenced this issue in rubysl/rubysl-net-imap
Open

Parse error 'unknown token - "K"' #1

Sebastian Wyder

Also having this issue.

Benjamin Klotz
Collaborator

I did some debugging and got stuck; maybe @razielgn or someone else could help me.

Rbx String#index is defined in https://github.com/rubinius/rubinius/blob/master/kernel/common/string.rb#L2224-L2246

It checks whether the argument passed to #index is a regexp or a string.

For a regexp, this is executed: https://github.com/rubinius/rubinius/blob/master/kernel/common/string.rb#L2234-L2246

From there I suspected a mistake in Regexp#match_from, because it returns nil/false for the given regexp.

Regexp#match_from is defined in https://github.com/rubinius/rubinius/blob/master/kernel/common/regexp.rb#L268-L271 and calls Regexp#search_region.

Regexp#search_region is defined via a Rubinius.primitive.

I inserted some puts calls to check whether the methods get called.

In https://github.com/rubinius/rubinius/blob/master/kernel/common/regexp.rb

  def match_from(str, count)
    puts "match_from #{str}, #{count}"
    return nil unless str
    search_region(str, count, str.bytesize, true)
  end

In https://github.com/rubinius/rubinius/blob/master/kernel/bootstrap/regexp.rb

  def search_region(str, start, finish, forward) # equiv to MRI's re_search
    puts "search_region #{str}, #{start}, #{finish}, #{forward}"
    Rubinius.primitive :regexp_search_region
    raise PrimitiveFailure, "Regexp#search_region primitive failed"
  end

The output for my new gist https://gist.github.com/Benny1992/8865959:

match_from /var/www/rubinius/runtime/gems/**/lib, 0
match_from /var/www/rubinius/runtime/gems/**/lib, 1
.
.
.
.
match_from /var/www/rubinius/gems/specifications/default/*.gemspec, 46
match_from K, 0
return value for regex: 
return class for regex: NilClass
return value for string: 0
return value for string: Fixnum

As you can see, match_from is called with 'K', but search_region is not.

My question:

What exactly happens to a method whose definition contains Rubinius.primitive?
Does the compiled file regexp.rbc contain the compiled Regexp#search_region method from the .cpp file?

Rubinius.primitive :regexp_search_region is defined in vm/gen/method_primitives.cpp,
which doesn't exist in the git repo, so I assume it is generated somehow, right?

Yorick Peterse

@Benny1992 The primitives are defined in C++ land in the following form:

// Rubinius.primitive :primitive_name_here

The primitives allow certain Ruby methods to call into corresponding C++ methods. They are processed during compilation but I can't remember the exact file that does that. They are basically turned into proper bytecode instructions (Rubinius.primitive isn't an actual Ruby method if I'm not mistaken).

In this case the corresponding C++ method is https://github.com/rubinius/rubinius/blob/a0e60d923fa8d11a82948fc8a13951b746a2c861/vm/builtin/regexp.hpp#L96 (implementation: https://github.com/rubinius/rubinius/blob/a0e60d923fa8d11a82948fc8a13951b746a2c861/vm/builtin/regexp.cpp#L450).

Benjamin Klotz
Collaborator

@YorickPeterse ah, OK, thanks.

I didn't see the comment above saying that search_region corresponds to MatchData* match_region.

For my own learning, one question:

The .rb and .cpp files get translated to Rubinius bytecode, and with the Rubinius.primitive "keyword" the corresponding method bytecode gets merged with the .rb file into the .rbc file.

And this bytecode is finally run by the VM, right?

Yorick Peterse

Yes. The VM doesn't operate on Ruby source code directly, instead it operates on bytecode only. The Rubinius compiler (https://github.com/rubinius/rubinius-compiler) takes care of compiling Ruby source code into Rubinius bytecode.

Benjamin Klotz
Collaborator

Okay thanks :+1:

Marcus Rohrmoser
mro commented

Also having this issue.

Benjamin Klotz
Collaborator

@YorickPeterse
I've got one more question (it has nothing to do with this issue):

You said rubinius-compiler compiles Ruby source into Rubinius bytecode.
Where do rubinius-melbourne and the other stage gems (rubinius-ast, etc.) get involved, or is rubinius-compiler only a wrapper around the different stage gems?

I'm kind of confused right now.

Yorick Peterse

Melbourne & friends are used to parse a raw Ruby source file into a corresponding AST. This AST in turn is used by rubinius-compiler to generate the bytecode. The process of running Ruby code under Rubinius is roughly as follows:

raw Ruby code -> Melbourne -> Compiler -> Virtual Machine
Benjamin Klotz
Collaborator

Okay, I thought so but wasn't sure.
Thanks for clearing this up :+1:

Sam

I can confirm I still have this issue on Rubinius 2.2.6

Sebastian Wyder

Yup, still having it with 2.2.6.

Benjamin Klotz
Collaborator

I tracked this down to https://github.com/rubinius/rubinius/blob/master/kernel/bootstrap/regexp.rb#L18-L21

But this is an rbx C++ primitive, so debugging it is rather hard for me.

Maybe someone with enough C++ expertise could look at this?
I'm also interested in pair-debugging this, to get an introduction to rbx C++ land :)

cc @dbussink, @razielgn, @YorickPeterse

Sam

This doesn't look to be fixed in 2.2.8

Yorick Peterse

@sammcj That could be correct since I don't recall anybody adding any specific patches for this bug.

Peter Marreck

FYI, saw this issue myself on Rubinius 2.2.10 when tinkering with rbx and some older IMAP code

Ryan Johnson

Any luck on this one? I am having the problem with rbx 2.2.10

Jesse Cooke
Owner

I don't recall if it's been fixed, but can you try on 2.4.1?

Benjamin Klotz
Collaborator

@ryan2johnson9, @jc00ke yep still present on 2.4.1:

Testscript: https://gist.github.com/bennyklotz/8865959

Ruby 2.2.0

rubinius-bugs ➤ ruby -v
ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-linux]
rubinius-bugs ➤ ruby index_gist.rb
return value for regex: 0
return class for regex: Fixnum
return value for string: 0
return value for string: Fixnum

Rubinius 2.4.1

rubinius-bugs ➤ ruby -v
rubinius 2.4.1.n8 (2.1.0 56274d35 2015-01-08 3.5.0 JI) [x86_64-linux-gnu]
rubinius-bugs ➤ ruby index_gist.rb
return value for regex: 
return class for regex: NilClass
return value for string: 0
return value for string: Fixnum
ErnstA

Problem still occurs

rvm
rbx-head [ i686 ]
ruby -v
rubinius 2.5.0.n25 (2.1.0 af7eb1b 2015-01-25 3.4 JI) [i686-linux-gnu]

The regex shown above does not match the same way as on MRI.
On MRI the test examples below pass.
The trouble on Rubinius is that the ATOM group matches only "O" instead of the whole token "OK"!
I tried to narrow it down. What in the regex does Rubinius not understand?
With a range up to \xfd it still works, but from \xfe onwards it fails!

require 'spec_helper'

describe 'https://github.com/rubinius/rubinius/issues/2916' do
  # On Rubinius this example fails:
  #   Failure/Error: expect(match_data[0]).to eq "OK2"
  #     expected: "OK2"
  #          got: "O"
  it "should find token OK2" do
    str = "* OK2 "
    regex = /[^\x80-\xfe ]+/ni
    str.index(regex, 2)
    match_data = $~
    expect(match_data[0]).to eq "OK2"
  end

  it "should find token OK3" do
    str = "* OK3 "
    regex = /[^\x80-\xfd ]+/ni
    str.index(regex, 2)
    match_data = $~
    expect(match_data[0]).to eq "OK3"
  end
end

Is there a serious problem with Rubinius regular expressions?
The title of this issue does not point to the underlying problem, so I have opened a new issue, #3300, in the hope
that those interested in regular expressions will see it.
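
For anyone who wants to try the two expectations without an RSpec setup, a plain-Ruby sketch might look like this. matched_text is a hypothetical helper name. The patterns and input strings are taken from the spec above.

```ruby
# Plain-Ruby version of the two expectations; no RSpec required.
# matched_text is a hypothetical helper name.
def matched_text(str, regex, pos)
  # String#index sets $~ on success; return the full matched text or nil.
  str.index(regex, pos) ? $~[0] : nil
end

upper_fe = /[^\x80-\xfe ]+/ni # range that reportedly fails on Rubinius
upper_fd = /[^\x80-\xfd ]+/ni # range that reportedly still works

puts matched_text("* OK2 ", upper_fe, 2).inspect
puts matched_text("* OK3 ", upper_fd, 2).inspect
```

According to the report above, MRI prints the full tokens for both patterns, while the affected Rubinius builds return only "O" for the first one.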

ErnstA

Until the regex problem is resolved, I am using a monkey patch on Net::IMAP and Net::IMAP::ResponseParser to avoid \xff and \xfe.

So far this has enabled me to connect to imap.gmail.com.
