Unable to read Binary String #89

Open
Jasmeet2011 opened this issue May 25, 2020 · 8 comments

@Jasmeet2011
Jasmeet2011 commented May 25, 2020

Describe the bug

I am reading a .docx file stored in a BLOB field in a MySQL database. The value comes out of the MySQL table as a binary string, extracted from the Logstash "Event". I am able to write the binary string to a file and then read it using Docx. However, if I pass the data directly to Docx, it raises an error.

To Reproduce

Steps to reproduce the behavior or put a short code to reproduce the bug.

example

require 'docx'
require 'stringio'

# Writing the binary string to a .docx file and reading it back works
File.binwrite('/path/to/your/docx/filename.docx', event.get('Blob field'))
doc = Docx::Document.new('/path/to/your/docx/filename.docx')

# ERROR: passing the binary string directly does not work
doc = Docx::Document.new(event.get('Blob field'))

# Wrapping the data in a StringIO did not work either
file_to_read = StringIO.new(event.get('Blob field'))
doc = Docx::Document.new(file_to_read)

## Expected behavior

Is there a way to pass a StringIO directly to Docx, or any other way to avoid writing the file to disk and then reading it back?
Sorry for the wrong label.
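
One possible stopgap, sketched under assumptions: if the installed `docx` version only accepts a file path, the bytes could be staged in a temporary file that Ruby cleans up automatically. This still touches the disk, but there is no hand-managed file left behind. `with_temp_docx` is a hypothetical helper name, and the block passed to it stands in for a call such as `Docx::Document.new(path)`.

```ruby
require 'tempfile'

# Hypothetical helper: stages binary docx bytes in a temporary file and
# yields its path, so a path-only reader can open it. The file is closed
# and removed once the block returns.
def with_temp_docx(bytes)
  Tempfile.create(['blob', '.docx']) do |file|
    file.binmode       # no newline or encoding translation (matters on Windows)
    file.write(bytes)
    file.flush         # make sure the bytes are on disk before reading by path
    yield file.path
  end
end

# Stand-in for event.get('resume'); a real blob would be a complete ZIP archive.
sample = "PK\x03\x04\x14\x00\b\b\b\x00".b

# Here the block only measures the file; in practice it would open the document.
size_on_disk = with_temp_docx(sample) { |path| File.binread(path).bytesize }
```

`Tempfile.create` returns the value of the block, so whatever is extracted from the document inside the block can be handed back to the caller.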

## Environment
- Ruby version: [e.g 2.7.1]
- `docx` gem version: [e.g 0.5.0]
- Windows
@WaKeMaTTa
Contributor

What exactly does event.get('Blob field') return?

@Jasmeet2011
Author

Thanks for the response.
According to the Logstash documentation:
Syntax: event.get(field)
Returns: Value for this field or nil if the field does not exist. Returned values could be a string, numeric or timestamp scalar value.

In my case, the field is a BLOB column in a MySQL table. According to the MySQL definition of BLOB:

BLOB values are treated as binary strings (byte strings). They have the binary character set and collation, and comparison and sorting are based on the numeric values of the bytes in column values.
So event.get('Blob field') should return a binary string.

@WaKeMaTTa
Contributor

WaKeMaTTa commented May 25, 2020

@Jasmeet2011 can you provide a sample of your "binary string"?

@Jasmeet2011
Author

Jasmeet2011 commented May 25, 2020

I can read the binary string and write it out as a Word document. I can send the Word document as read from the Event API; the binary string, when written to a file using
File.binwrite('new.docx', event.get('resume')) # where 'resume' is the field containing the Blob (url)
can be read using Docx.
I don't know of any other way to copy the binary string. Please suggest one.
new.docx
Copy of the file

@Jasmeet2011
Author

So I managed to view part of the Blob data content:
#<Sequel::SQL::Blob:0x840 bytes=7093 start="PK\x03\x04\x14\x00\b\b\b\x00" end="<\x02\x00\x00c\x19\x00\x00\x00\x
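
The `start="PK\x03\x04..."` part of that inspect output is the local file header signature that a ZIP archive (and so a .docx file) normally begins with, which suggests the field holds the file's contents rather than a path to it. A minimal sketch of telling the two apart follows. `FakeBlob` is a hypothetical stand-in for `Sequel::SQL::Blob` (which is a `String` subclass), and `zip_bytes?` is an illustrative helper name, not part of any library.

```ruby
# Hypothetical stand-in for Sequel::SQL::Blob, which subclasses String.
class FakeBlob < String; end

# First four bytes of a ZIP local file header.
ZIP_SIGNATURE = "PK\x03\x04".b.freeze

# Returns true when the data looks like ZIP (and so possibly docx) content
# rather than a filesystem path.
def zip_bytes?(data)
  data.to_s.b.start_with?(ZIP_SIGNATURE)
end

blob = FakeBlob.new("PK\x03\x04\x14\x00\b\b\b\x00".b)

looks_like_content = zip_bytes?(blob)                  # bytes of an archive
looks_like_path    = zip_bytes?('C:/docs/resume.docx') # an ordinary path
```

A check like this only inspects the first four bytes; it does not prove the data is a valid document.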

@satoryu
Member

satoryu commented Jun 21, 2020

#ERROR--THIS DOES NOT WORK
doc = Docx::Document.new(event.get('Blob field'))

@Jasmeet2011 could you give us the error messages and backtraces appearing at this line?

@Jasmeet2011
Author

Sure, I will get back to you.

@Jasmeet2011
Author

Jasmeet2011 commented Jun 27, 2020

THIS DOES NOT WORK
doc= Docx::Document.new(event.get('resume'))

This is the error I receive:

`][ERROR][logstash.filters.ruby    ] Ruby exception occurred: string contains null byte'

'C:/Users/sun/Downloads/elk/logstash-6.8.0/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated'
'{
    "first_name" => "Janine ",
           "dob" => 1980-01-03T18:30:00.000Z,
          "tags" => [
        [0] "_rubyexception"
    ],
         "email" => "janine.l@gmail.com\r",
      "@version" => "1",
    "@timestamp" => 2020-06-27T05:33:01.018Z,
            "id" => 4,
     "last_name" => "Labrune",
         "phone" => "(406) 785-5588",
          "type" => "docx",
        "resume" => #<Sequel::SQL::Blob:0x80a bytes=5568 start="PK\x03\x04\x14\x00\b\b\b\x00" end="<\x02\x00\x00n\x13\x00\x00\x00\x00">
}`
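
For context, `string contains null byte` is the message Ruby raises when a string with an embedded NUL byte is used as a file path, which would be consistent with the blob being treated as a path rather than as file contents. A small sketch that reproduces the same message with only the standard library (no `docx` involved; `path_error_for` is a hypothetical helper):

```ruby
# Hypothetical helper: tries to open `candidate` as a file path and returns
# the message of the error Ruby raises, or nil if the open succeeds.
def path_error_for(candidate)
  File.open(candidate, 'rb') { |f| f.read(0) }
  nil
rescue ArgumentError, SystemCallError => e
  e.message
end

# First bytes of the blob as shown in the log above; note the embedded \x00.
blob_start = "PK\x03\x04\x14\x00\b\b\b\x00".b

message = path_error_for(blob_start)
```

If the message matches, the fix would lie in handing the bytes to something that reads from a buffer or IO object instead of from a path.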
