marisa: Finished with the last bit of cleanup
pgaskin committed Mar 24, 2020
1 parent 0239665 commit 5bfa4c5
Showing 2 changed files with 3 additions and 4 deletions.
1 change: 0 additions & 1 deletion marisa/marisa.cc
@@ -1,6 +1,5 @@
#include <cstdlib>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

6 changes: 3 additions & 3 deletions marisa/shim.h
@@ -96,8 +96,8 @@ class iopbuf : public std::basic_streambuf<char> {

this->setg(&this->rbuf_, &this->rbuf_, &this->rbuf_ + (n>0 ? n : 0)); // Update the current byte.
return this->gptr() == this->egptr() // If the new current pos == past end of buffer, no byte was read (n<=0).
- ? iopbuf::traits_type::eof() // If no byte was read (and no error was thrown earlier), it's an EOF.
- : iopbuf::traits_type::to_int_type(this->rbuf_); // Otherwise, return the byte we just read (note: without to_int_type, 0xFF would be sign extended to -1/eof).
+ ? iopbuf::traits_type::eof() // If no byte was read (and no error was thrown earlier), it's an EOF.
+ : iopbuf::traits_type::to_int_type(this->rbuf_); // Otherwise, return the byte we just read (note: without to_int_type, 0xFF would be sign extended to -1/eof).
}

std::streamsize xsgetn(iopbuf::char_type* buf, std::streamsize buf_n) override {
@@ -113,7 +113,7 @@ class iopbuf : public std::basic_streambuf<char> {
this->rbuf_ = n>0 ? buf[n-1] : 0; // Set the current byte to the last one read, if any.
this->setg(&this->rbuf_, &this->rbuf_, &this->rbuf_ + (n>0 ? 1 : 0)); // Update the current byte.
return this->gptr() == this->egptr() // If the new current pos == past end of buffer, no byte was read (n<=0).
- ? iopbuf::traits_type::eof() // If no byte was read (and no error was thrown earlier), it's an EOF
+ ? iopbuf::traits_type::eof() // If no byte was read (and no error was thrown earlier), it's an EOF
: n; // Otherwise, return the number of bytes read.
}
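
The comment on `to_int_type` in `underflow` above can be checked in isolation. The following is a minimal sketch, not part of the commit: `naive_to_int`, `traits_to_int`, and `check_byte_conversion` are hypothetical names, and it assumes 8-bit bytes and an implementation where `EOF` is -1 (true of common platforms, but not guaranteed by the standard).

```cpp
#include <cassert>
#include <string> // std::char_traits

// Naive conversion: a plain cast sign-extends a signed byte, so the bit
// pattern 0xFF becomes -1.
inline int naive_to_int(signed char c) { return static_cast<int>(c); }

// Conversion through the traits: the byte is first treated as unsigned char,
// so 0xFF stays 255 whether plain char is signed or unsigned.
inline int traits_to_int(char c) { return std::char_traits<char>::to_int_type(c); }

// Hypothetical self-check illustrating the difference.
inline void check_byte_conversion() {
    const int eof = std::char_traits<char>::eof();
    assert(naive_to_int(static_cast<signed char>(-1)) == -1); // looks like EOF where EOF == -1
    assert(traits_to_int(static_cast<char>(-1)) == 0xFF);     // a real data byte
    assert(traits_to_int(static_cast<char>(-1)) != eof);      // never mistaken for EOF
}
```

Returning the raw byte from `underflow` would therefore make a stream appear to end at the first 0xFF on platforms where `char` is signed.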


2 comments on commit 5bfa4c5

@pgaskin pgaskin commented on 5bfa4c5 Mar 24, 2020

@NiLuJe, you might find the shim stuff (shim.go, shim.h) interesting (disclaimer: it's my first time implementing custom C++ streams, so feel free to correct anything if you have more experience with them).

@pgaskin pgaskin commented on 5bfa4c5 Mar 24, 2020

I just noticed I made a mistake somewhere:

Loading dictzip /home/patrick/Documents/Books/Books/.kobodict/webster1913.zip.
Error: load dictionary: read dictionary: read words index: marisa: libmarisa.cc:1265: MARISA_IO_ERROR: !stream_->read(static_cast<char *>(buf), static_cast<std::streamsize>(size))
exit status 1

I think something's sending an EOF too early. I'll look into it myself tomorrow.

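Update: It's actually a pretty simple issue. Marisa expects xsgetn to either read everything that was requested, read nothing, or throw an error. It calls std::istream::read, which sets failbit when fewer bytes than requested are returned, so a short read from xsgetn shows up as MARISA_IO_ERROR. The same seems to be true of quite a few other C++ libs. All I need to do is make it loop the way the default implementation does.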
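
A rough sketch of what such a loop could look like follows. This is not the fix that was actually committed to shim.h: `short_reader` is a hypothetical stand-in for the Go-side reader that deliberately returns short reads, and `looping_buf` and `read_exact` are illustrative names. The idea is only that `xsgetn` keeps asking the source for more until the request is filled or the source is exhausted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical source that hands out at most `chunk` bytes per call,
// imitating a reader that is allowed to return short reads.
struct short_reader {
    std::string data;
    std::size_t pos;
    std::size_t chunk;

    short_reader(std::string d, std::size_t c) : data(std::move(d)), pos(0), chunk(c) {}

    // Copies up to min(n, chunk) bytes into buf; returns 0 at end of data.
    std::streamsize read(char* buf, std::streamsize n) {
        assert(n >= 0);
        std::size_t want = std::min(static_cast<std::size_t>(n), chunk);
        std::size_t got = std::min(want, data.size() - pos);
        std::memcpy(buf, data.data() + pos, got);
        pos += got;
        return static_cast<std::streamsize>(got);
    }
};

class looping_buf : public std::basic_streambuf<char> {
public:
    explicit looping_buf(short_reader& src) : src_(src), cur_(0) {}

protected:
    // Loop until buf_n bytes were read or the source reports end of data.
    std::streamsize xsgetn(char_type* buf, std::streamsize buf_n) override {
        std::streamsize total = 0;
        // Hand out a byte left in the one-byte get area (e.g. after a peek) first.
        if (buf_n > 0 && gptr() != egptr()) {
            buf[total++] = *gptr();
            gbump(1);
        }
        while (total < buf_n) {
            std::streamsize n = src_.read(buf + total, buf_n - total);
            if (n <= 0) break; // end of data: return the short count
            total += n;
        }
        return total;
    }

    // Single-byte get area, similar in spirit to the shim's rbuf_.
    int_type underflow() override {
        if (src_.read(&cur_, 1) != 1) {
            setg(&cur_, &cur_, &cur_);
            return traits_type::eof();
        }
        setg(&cur_, &cur_, &cur_ + 1);
        return traits_type::to_int_type(cur_);
    }

private:
    short_reader& src_;
    char cur_;
};

// Reads exactly n bytes via std::istream::read; false if the stream failed.
bool read_exact(short_reader& src, std::string& out, std::size_t n) {
    looping_buf sb(src);
    std::istream in(&sb);
    out.assign(n, '\0');
    return static_cast<bool>(in.read(&out[0], static_cast<std::streamsize>(n)));
}
```

With a source that yields 3 bytes at a time, `read_exact` for 13 bytes succeeds because `xsgetn` loops over the short reads. It still fails, as it should, when the source really holds fewer bytes than requested.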
