
BigCouch returns JSON array for sequence #1478

Closed
opie4624 opened this issue Nov 17, 2011 · 11 comments

@opie4624

BigCouch is returning:

 "last_seq":[136374,"g1AAAAIReJzLYWBg4MxgTmGQT8pMT84vTc5wKC5ITc5MzMksLtFLrkyqzNHLyU9OzMkBKmRKZMhjYZA-XiiTlcTAYOJDtD6gLhmgLiBl3vfYAqTZdTtIswRcc0FaMjZdFkDlQCo4NDQUpMuOgRQrQ4C6gFQ-0GqQZsfvpGguAOoCUt2PLfpAmm0jiXJvD1A5kFq-atUqkC5jFlKsXAHUBaQOF8ocB4fvEVI0HwHqAlL3gSEG0uygS5R7HwCVA6n_QADSZZqXBQCZGar4"]}

elasticsearch is requesting:

 /my_db/_changes?feed=continuous&include_docs=true&heartbeat=10000&since=%5B136374%2C+g1AAAAIReJzLYWBg4MxgTmGQT8pMT84vTc5wKC5ITc5MzMksLtFLrkyqzNHLyU9OzMkBKmRKZMhjYZA-XiiTlcTAYOJDtD6gLhmgLiBl3vfYAqTZdTtIswRcc0FaMjZdFkDlQCo4NDQUpMuOgRQrQ4C6gFQ-0GqQZsfvpGguAOoCUt2PLfpAmm0jiXJvD1A5kFq-atUqkC5jFlKsXAHUBaQOF8ocB4fvEVI0HwHqAlL3gSEG0uygS5R7HwCVA6n_QADSZZqXBQCZGar4%5D

It looks like elasticsearch isn't expecting BigCouch's use of JSON arrays as a sequence identifier. The string element doesn't appear to be quoted when it's passed back in the &since= parameter.
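The mismatch can be sketched like this: BigCouch's `last_seq` is a two-element JSON array (a number plus an opaque string), and it has to be passed back to `_changes` as JSON, with the string element still quoted. This is a minimal illustration only; `since_param` is a hypothetical helper, not part of the river code.

```python
import json
from urllib.parse import quote

def since_param(last_seq):
    """Build the `since` query value for a _changes request.

    Plain CouchDB returns last_seq as a bare number; BigCouch returns
    a JSON array [number, "opaque-string"]. The array form must be
    re-serialized as JSON (keeping the string quoted) before it is
    URL-escaped, or the server cannot parse it back.
    """
    if isinstance(last_seq, (list, tuple)):
        # Re-encode as compact JSON so the opaque string stays quoted.
        return quote(json.dumps(list(last_seq), separators=(",", ":")))
    return str(last_seq)
```

The unquoted form seen in the request above (`%5B136374%2C+g1AA...%5D`) drops the inner quotes, which is exactly what this helper avoids.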

@kimchy kimchy closed this as completed in 8e9f01a Nov 18, 2011
@kimchy
Member

kimchy commented Nov 18, 2011

I pushed a fix for this to both the 0.18 and master branches; it would be great if you could test it with BigCouch.

@opie4624
Author

Pulling and compiling now... I'll let you know in a bit.

@opie4624
Author

No dice. It can't consume the _changes stream at all:


[2011-11-18 11:00:45,540][WARN ][river.couchdb            ] [Ravage 2099] [couchdb][stream-data] failed to execute: failure in bulk execution:
[100]: index [_river], type [stream-data], id [_seq], message [MapperParsingException[Failed to parse [couchdb.last_seq]]; nested: NumberFormatException[For input string: "g1AAAAH-eJyV0M0NgkAQBeDx52IHJpLYAVc52oNSwLKiLNkICJrQgiBtaGhDI1qFJhSijzXRgx7WvbzL-zKzI4mo53VmZDhiwYM198bhSmxY4po8dVJpyoAzKdFqM1p2aVBFhu8QtbIG9T9ozr8A6gbqiFFRW0pdtJSFOmJi27ZSZ70FQaYgiABzlcwbOXzLOHS5YFLEyS8cQiGy2ioU3motm6OO2JdlqdRVe9kDCOIYGZWSO615J9QRNxzpz3l3EMQD7_U__wndgqMv"]; ]
[2011-11-18 11:00:45,637][DEBUG][action.bulk              ] [Ravage 2099] [_river][0] failed to bulk item (index) index {[_river][stream-data][_seq], source[{"couchdb":{"last_seq":[3900,"g1AAAAH-eJzLYWBg4MxgTmGQScpMT84vTc5wKCjKLEssSdVLrkyqzNHLyU9OzMkBqmJKZMhjYZA-XiiTlcTAwDgVpEkCoSktGUMDULkMUDmQMu97bAHWdYUoXRZA5UAqODQ0FKzrEnEOBGoJAWoBUvlAe8E6p4F0ysN1FhekJmcm5mQWl2DTXADUBaS6H1v0keDFHqByILV81apVYF3XiHbsCqAWIHW4UOY4WOd0ouw7AlQOpO4DAwms6yrR9j0AagFS_4EArHNKFgA3I6OT"]}}]}
org.elasticsearch.index.mapper.MapperParsingException: Failed to parse [couchdb.last_seq]
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:308)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:577)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:565)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:435)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:491)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:433)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:475)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:416)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareIndex(InternalIndexShard.java:302)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:136)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:487)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:400)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.NumberFormatException: For input string: "g1AAAAH-eJzLYWBg4MxgTmGQScpMT84vTc5wKCjKLEssSdVLrkyqzNHLyU9OzMkBqmJKZMhjYZA-XiiTlcTAwDgVpEkCoSktGUMDULkMUDmQMu97bAHWdYUoXRZA5UAqODQ0FKzrEnEOBGoJAWoBUvlAe8E6p4F0ysN1FhekJmcm5mQWl2DTXADUBaS6H1v0keDFHqByILV81apVYF3XiHbsCqAWIHW4UOY4WOd0ouw7AlQOpO4DAwms6yrR9j0AagFS_4EArHNKFgA3I6OT"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:410)
    at java.lang.Long.parseLong(Long.java:468)
    at org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:68)
    at org.elasticsearch.index.mapper.core.LongFieldMapper.parseCreateField(LongFieldMapper.java:231)
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:295)
    ... 14 more

@kimchy
Member

kimchy commented Nov 19, 2011

You need to delete the river first, since last_seq got identified as numeric (maybe you worked with a local CouchDB earlier), and it then changed to an array when you moved to BigCouch.
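The failure mode described here can be modeled with a toy sketch. `DynamicMapping` below is hypothetical, purely to illustrate how Elasticsearch's dynamic mapping locks a field's type the first time it is indexed, after which a mixed-type array can no longer be parsed:

```python
class DynamicMapping:
    """Toy model of dynamic mapping: the first value seen for a field
    fixes its type, and later values must coerce to that type."""

    def __init__(self):
        self.types = {}

    def index(self, field, value):
        # First write wins: remember the type of the first value seen.
        kind = self.types.setdefault(field, type(value))
        if kind is int and not isinstance(value, int):
            # Mirrors the NumberFormatException in the trace above:
            # a long-mapped field cannot accept a string element.
            raise ValueError(f"cannot parse {value!r} as a number")
        return value
```

Once `last_seq` is mapped as a long (from a plain-CouchDB run), indexing BigCouch's `[number, "opaque"]` array fails, which is why the river state has to be deleted to reset the mapping.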

@opie4624
Author

I'm starting over with a fresh data directory and recreating the river
every time I test.

@kimchy
Member

kimchy commented Nov 20, 2011

Can you set river.couchdb to TRACE in the logging.yml file and run it? It seems like the seq stored under the _river index is being identified as numeric for some reason. The trace-level logging will show what gets indexed.
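For reference, the change might look like this (assuming the 0.18-era `logging.yml` layout, where loggers are listed under a top-level `logger:` key):

```yaml
logger:
  # Raise the CouchDB river logger to TRACE to see what gets indexed.
  river.couchdb: TRACE
```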

@opie4624
Author

Log file is about 17MB: http://ge.tt/8QErZBA?c

@opie4624
Author

https://gist.github.com/69c5cc6d2f214d3a5204

Looks like elasticsearch is parsing the JSON sequence info out properly for indexing, but is not storing it properly in the river's _seq document.

If I look at what's in elasticsearch, there is no _seq document for the given river.

@kimchy
Member

kimchy commented Nov 23, 2011

I see where the problem is: it's the format of the array, which mixes string and numeric types internally. Can I say again that this format BigCouch uses is quite atrocious...
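One way to cope with the mixed-type array is to coerce `last_seq` to a string before indexing it, so the mapping never locks onto a numeric type. This is a sketch of the approach only; `seq_for_storage` is a hypothetical helper, not the actual patch:

```python
import json

def seq_for_storage(last_seq):
    """Normalize last_seq to a string before it is indexed.

    CouchDB hands back a bare number; BigCouch hands back a JSON
    array mixing a number and an opaque string. Storing both shapes
    as strings sidesteps the numeric-mapping conflict entirely.
    """
    if isinstance(last_seq, (list, tuple)):
        # Preserve the array verbatim as compact JSON text.
        return json.dumps(list(last_seq), separators=(",", ":"))
    return str(last_seq)
```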

@kimchy
Member

kimchy commented Nov 23, 2011

OK, I pushed another attempt at solving it.

@opie4624
Author

Getting the "failed to convert" error.
Log: https://gist.github.com/d215005dc57a77adee49
