
"Capacity problem" - or problem in Protobuf logic #26

Closed
mbflex opened this issue May 21, 2013 · 15 comments


mbflex commented May 21, 2013

Running ProtoBuf.js in real life with real data sometimes results in an error message:

Cannot read uint8 from ByteBuffer(offset=642,markedOffset=-1,length=644,capacity=644): Capacity overflow.

The problem clearly depends on the data being handled. Unfortunately, the cause and a solution are still open.


mbflex commented May 21, 2013

The underlying buffer seems to be resized on the fly when data is written into the ByteBuffer.
Since this works fine, there should be no real capacity problem at all.

The problem is rather that ProtoBuf tries to read data that was never written into the buffer. Could this be a problem in the underlying logic of ProtoBuf?


dcodeIO commented May 21, 2013

Unsure what's going on there, as offset=642 with length/capacity=644 should not throw an exception (there are still 2 uint8s left). Are you using the latest version of ByteBuffer.js? Edit: Have you tried BB 1.3.6?


dcodeIO commented May 21, 2013

I've added a bit more information to the exception message in BB 1.3.6 (now on NPM). It now also names the actual offset that's being accessed.

@ghost ghost assigned dcodeIO May 21, 2013

mbflex commented May 22, 2013

I updated to the latest version just now. The full error stack is now:
Error: Cannot read uint8 from ByteBuffer(offset=194,markedOffset=-1,length=644,capacity=644) at 644: Capacity overflow
at ByteBuffer.readUint8 (..\protobufjs\node_modules\bytebuffer\ByteBuffer.js:607:23)
at Function.ByteBuffer.decodeUTF8Char (..\protobufjs\node_modules\bytebuffer\ByteBuffer.js:1423:25)
at ByteBuffer.readUTF8StringBytes (..\protobufjs\node_modules\bytebuffer\ByteBuffer.js:1633:34)
at ByteBuffer.readVString (..\protobufjs\node_modules\bytebuffer\ByteBuffer.js:1727:28)
at ProtoBuf.Reflect.Field.decode (..\protobufjs\ProtoBuf.js:2026:35)
at ProtoBuf.Reflect.Message.decode (..\protobufjs\ProtoBuf.js:1611:51)
at ProtoBuf.Reflect.Field.decode (..\protobufjs\ProtoBuf.js:2041:46)
at ProtoBuf.Reflect.Message.decode (..\protobufjs\ProtoBuf.js:1609:51)
at ProtoBuf.Reflect.Field.decode (..\protobufjs\ProtoBuf.js:2041:46)
at ProtoBuf.Reflect.Message.decode (..\protobufjs\ProtoBuf.js:1611:51)


mbflex commented May 22, 2013

I am using Protobufjs in a loop, like:
"every 5 seconds: read the GTFS (ProtoBuf) file from the web server, decode it ... read again ..."

Typically it works for a while (or even just for the very first run), and then it crashes during a later run.
The problem seems to be the loop, which apparently is not "allowed".

What is the problem with the loop?


mbflex commented May 22, 2013

Finally, I found that the problem is not the decoding process but the HTTP request: after a while, the remote server starts to use chunked transfer encoding, which was not handled correctly in my code. So protobufjs works fine, but it needs valid data input ;-)


dcodeIO commented May 22, 2013

Maybe I could add some sort of #decodeFromUrl(...) or something. Would you share your code?


mbflex commented May 23, 2013

I am now using this code (but I am still not sure whether it is 100% correct):

var ProtoBuf = require("protobufjs");
var http = require('http');
var configServer = {host: '.........', port: 81, path: "/gtfs", method: 'GET'};
var configFrequencySeconds = 3;
var myDecoder = ProtoBuf.protoFromFile("conf/gtfs-realtime.proto").build("transit_realtime").FeedMessage;
readGtfs();

function readGtfs() {
    var data = 0;
    var req = http.request(configServer, function(res) {
        res.on('data', function (chunk) {
            if (!data) data = chunk; else data += chunk;
        });
        res.on('end', function () {
            var feed = myDecoder.decode(data);
            // ... < work with decoded data >
        });
    });
    req.on('error', function(e) { console.log('ERROR: problem with request: ' + e.message); });
    req.end();
    setTimeout(readGtfs, configFrequencySeconds * 1000);
}

@dcodeIO dcodeIO closed this as completed May 25, 2013

saccodd commented Dec 27, 2013

Hi,
I am working on a Node.js solution to retrieve and process GTFS-realtime data.
Currently my code is based on yours and works well on TriMet and BART feeds.
However, it doesn't work on others I tested. Specifically, I am focusing on the VehiclePosition feed provided by MBTA (http://developer.mbta.com/lib/gtrtfs/Vehicles.pb), which I cannot decode.
There seems to be a mismatch between the proto file and the feed they provide:
one field is always missing, typically the header.
I have tried requesting the data every 3 seconds, but the issue persists: I never retrieve a complete feed.
This problem drives me crazy! Maybe I am missing something in my code? Or maybe their feed is corrupted? Or something goes wrong in the ProtoBuf.js implementation?
Any help is really appreciated! I would like to hear about your experiences.
Thank you very much.
Daniele


dcodeIO commented Dec 27, 2013

One thing you could try is to reverse engineer the Vehicles.pb to validate that it actually matches the proto definition. A mismatch would be the obvious reason for failure.

If you assume a bug in ProtoBuf.js, any additional information would be useful, like the errors thrown or a breakdown of the data and proto file to a minimal failing case.


dcodeIO commented Dec 27, 2013

Another point of error could be a string conversion somewhere between requesting and parsing the Vehicles.pb, which is bad: it will most likely corrupt the data.

The same applies under node: work with Buffers instead of strings when fetching the data, maybe forcing Content-Type: application/octet-stream. To validate this case, download the Vehicles.pb to your hard drive and load it through fs.readFileSync, then try to YourMessage#decode it without any additional conversion. If this works and the remotely fetched data does not, something is wrong in between.

saccodd commented Dec 28, 2013

The only conversion I have is from binary to ASCII, because ProtoBuf needs a valid base64-encoded string.
By the way, here you have my code:

var ProtoBuf = require("protobufjs");
var path = require('path');
var http = require('http');
var configServer = {host: 'developer.mbta.com', path: '/lib/gtrtfs/Vehicles.pb'};

var configFrequencySeconds = 3;
var myDecoder = ProtoBuf.protoFromFile(path.join(__dirname, "www", "gtfs-realtime.proto")).build("transit_realtime").FeedMessage;
readGtfs();

function readGtfs() {
    var data = 0;
    var req = http.request(configServer, function(res) {
        //res.responseType = "arraybuffer";
        res.on('data', function (chunk) {
            if (!data) data = chunk; else data += chunk;
        });
        res.on('end', function () {
            var database64 = btoa(data);
            try {
                var feed = myDecoder.decode(database64);
                console.log("Test 1.1 " + JSON.stringify(feed, null, 4));
                console.log("Test 1.2 " + feed.entity[1].id);
            } catch (e) {
                if (e.decoded) { // Truncated
                    feed = e.decoded; // Decoded message with missing required fields
                    console.log("Test 2.1 " + JSON.stringify(feed, null, 4));
                    console.log("Test 2.2 " + feed);
                } else { // General error
                    console.log(e);
                }
            }
        });
    });
    req.on('error', function(e) { console.log('ERROR: problem with request: ' + e.message); });
    req.end();
    setTimeout(readGtfs, configFrequencySeconds * 1000);
}

I will try your test cases and I will let you know.


dcodeIO commented Dec 28, 2013

The problem is this:

if (!data) data = chunk; else data += chunk; 

This converts data from a Buffer object to a string. Never do this unless the Buffer's data is a UTF-8 encoded string! Instead try:

function readGtfs() {
    var data = []; // List of buffers
    var req = http.request(configServer, function(res) {
        res.on('data', function (chunk) {
                data.push(chunk); // Add buffer chunk
        });
        res.on('end', function (){ 
            data = Buffer.concat(data); // Make one large buffer of it
            try {
                var feed = myDecoder.decode(data); // And decode it
...

This way, no string conversion happens and YourMessage#decode should work (no need for base64).

I've also updated the FAQ: https://github.com/dcodeIO/ProtoBuf.js/wiki/How-to-read-binary-data-in-the-browser-or-under-node.js%3F


saccodd commented Dec 28, 2013

It works perfectly!!!
Your ProtoBuf.js is awesome! Thank you very much for your support as well.
If my project evolves well, I will let you know.
Daniele


dcodeIO commented Dec 28, 2013

You are welcome!
