Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Client#readlines is broken for files bigger than chunksize #3

Closed
wants to merge 2 commits into from

2 participants

Andy Lindeman Brian Muller
Andy Lindeman

Also added a Gemfile and some tests. You can see the brokenness if you run the tests against the original code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Showing 2 unique commits by 1 author.

Oct 22, 2011
Andy Lindeman alindeman Adds Gemfile 964425b
Andy Lindeman alindeman Fixes bug with #readlines 5efa489
This page is out of date. Refresh to see the latest.
3  .gitignore
... ... @@ -1,3 +1,4 @@
1 1 docs
2 2 lib
3   -pkg
  3 +pkg
  4 +Gemfile.lock
3  Gemfile
... ... @@ -0,0 +1,3 @@
  1 +source :rubygems
  2 +
  3 +gemspec
1  ganapati.gemspec
@@ -19,4 +19,5 @@ Gem::Specification.new do |s|
19 19 s.executables << 'hcp'
20 20 s.rubyforge_project = "ganapati"
21 21 s.add_dependency('thrift', '>= 0.5.0')
  22 + s.add_development_dependency('mocha', '>=0.10.0')
22 23 end
20 lib/ganapati/client.rb
@@ -53,19 +53,19 @@ def readchunks(path, chunksize=1048576)
53 53 end
54 54
55 55 def readlines(path, sep="\n")
56   - lastbit = ""
  56 + buffer = ""
  57 +
57 58 readchunks(path) { |chunk|
58   - parts = chunk.split(sep)
59   - if parts.length == 0
60   - yield lastbit if lastbit != ""
61   - elsif parts.length == 1
62   - yield lastbit + parts.first
63   - else
64   - yield lastbit + parts.first
65   - parts.slice(1, parts.length).each { |p| yield p }
  59 + buffer << chunk
  60 +
  61 + # partitions[1] will be empty if sep does not exist in the string
  62 + while partitions = buffer.partition(sep) and !partitions[1].empty?
  63 + yield partitions.first
  64 + buffer = partitions.last
66 65 end
67   - lastbit = ""
68 66 }
  67 +
  68 + yield buffer if buffer.size > 0
69 69 end
70 70
71 71 # for writing to a new file
71 test/client_test.rb
... ... @@ -0,0 +1,71 @@
  1 +require "test_helper"
  2 +
  3 +class ClientTest < Test::Unit::TestCase
  4 + def stub_thrift!
  5 + @socket = stub_everything("socket")
  6 + Thrift::Socket.stubs(:new).returns(stub)
  7 +
  8 + @transport = stub_everything("transport")
  9 + Thrift::BufferedTransport.stubs(:new).returns(@transport)
  10 +
  11 + @protocol = stub_everything("protocol")
  12 + Thrift::BinaryProtocol.stubs(:new).returns(@protocol)
  13 +
  14 + @client = stub_everything("client")
  15 + ThriftHadoopFileSystem::Client.stubs(:new).returns(@client)
  16 + end
  17 +
  18 + def setup
  19 + stub_thrift!
  20 + end
  21 +
  22 + def test_readlines_with_one_chunk
  23 + ganapati = Ganapati::Client.new("127.0.0.1", 1234)
  24 + ganapati.stubs(:readchunks).yields("abc123")
  25 +
  26 + lines = []
  27 + ganapati.readlines("/foo") { |line| lines << line }
  28 +
  29 + assert_equal ["abc123"], lines
  30 + end
  31 +
  32 + def test_readlines_with_two_chunks
  33 + ganapati = Ganapati::Client.new("127.0.0.1", 1234)
  34 + ganapati.stubs(:readchunks).multiple_yields(["abc\n123"], ["456"])
  35 +
  36 + lines = []
  37 + ganapati.readlines("/foo") { |line| lines << line }
  38 +
  39 + assert_equal ["abc", "123456"], lines
  40 + end
  41 +
  42 + def test_readlines_with_chunks_with_multiple_lines
  43 + ganapati = Ganapati::Client.new("127.0.0.1", 1234)
  44 + ganapati.stubs(:readchunks).multiple_yields(["abc\n123\n456\n"], ["789"])
  45 +
  46 + lines = []
  47 + ganapati.readlines("/foo") { |line| lines << line }
  48 +
  49 + assert_equal ["abc", "123", "456", "789"], lines
  50 + end
  51 +
  52 + def test_readlines_with_middle_chunk
  53 + ganapati = Ganapati::Client.new("127.0.0.1", 1234)
  54 + ganapati.stubs(:readchunks).multiple_yields(["abc\n123"], ["456"], ["789"])
  55 +
  56 + lines = []
  57 + ganapati.readlines("/foo") { |line| lines << line }
  58 +
  59 + assert_equal ["abc", "123456789"], lines
  60 + end
  61 +
  62 + def test_readlines_with_sep_character_at_the_end_of_chunks
  63 + ganapati = Ganapati::Client.new("127.0.0.1", 1234)
  64 + ganapati.stubs(:readchunks).multiple_yields(["abc\n"], ["123\n"], ["456\n"])
  65 +
  66 + lines = []
  67 + ganapati.readlines("/foo") { |line| lines << line }
  68 +
  69 + assert_equal ["abc", "123", "456"], lines
  70 + end
  71 +end
7 test/test_helper.rb
... ... @@ -0,0 +1,7 @@
  1 +require "test/unit"
  2 +
  3 +require "rubygems"
  4 +require "mocha"
  5 +
  6 +$LOAD_PATH << File.expand_path("../lib", File.dirname(__FILE__))
  7 +require "ganapati"

Tip: You can add notes to lines in a file. Hover to the left of a line to make a note

Something went wrong with that request. Please try again.