Cutted _version_ field #71

jaceksocha · 2013-09-25T10:58:05Z

Currently search method returns data from SOLR which is parsed by JSON.parse(). Result of this parsing is truncated "version" field, because "version" occurs in solr as long type, which is parsing to int.
The solution of this bug may be an alternative parser to JSON.parse() (for example: https://npmjs.org/package/json-bigint)
or switch to return raw data (without parsing).

Is there any possibility to patch this issue?

lbdremy · 2013-10-02T22:40:14Z

Hi @githubjs,

Could you provide me the JSON response from Solr that JSON.parse is not able to parse.

Thanks

marc-portier · 2014-08-12T06:32:56Z

I just ran into this myself, and reported in solr land as https://issues.apache.org/jira/browse/SOLR-6364 - and was just coming over here to suggest what @githubjs has already done in this issue :)

You can see this at work by adding the option { "versions" : "true" } during a solrclient.add()
The response will then hold a section "adds" which is an array holding

[0] the id of the created document and
[1] the assigned version --> current json parsing will always round that number to a multiple of 100 (ie loosing the last two digits of precision)

(almost funny: the solr4 web-ui uses javascript in the browser and suffers from the same issue, you really need curl cli commands - or anything not passing javascript Number limitations to see the actual not rounded numeric value)

The point is that large numbers (e.g. in version field of solr4) are "acceptable" for the json spec, but are truncated/rounded in javascript (the issue mentioned holds all the references)

Replacing the classic JSON.parse and stringify with the one from the mentioned json-bigint package should do the trick.

I'm volunteering to squeeze this into the v0.3.x since I need this anyway - my other option is to switch to XML messaging towards SOLR which is way less elegant IMHO

Even if the mentioned solr-issue (for now) suggests some future where this is fixed through a " "json.long" parameter that can force string responses, it is beneficial for our client library to be able to work without it and thus support deployments with existing SOLR 4.x out there.

I'm not sure yet what the json-bigint effect will be on numbers that are not too big, and if there is any danger in breaking our own api/contract to users of this library. If that is the case I suggest we introduce some init-creation-time option indicating to deliberately use this json-big-int.

Any other thoughts or suggestions?

marc-portier · 2014-08-13T11:32:35Z

hm, this is a bumpy road:

introducing the json-bigint I run into this: sidorares/json-bigint@4fd9752#commitcomment-7370451

this makes the ping-test fail because for no apparent reason the admin/ping response holds the key 'echoParams' twice: see also https://issues.apache.org/jira/browse/SOLR-6369

I'm waiting some to see how json-bigint and solr communities are going to respond.
Without their help the only (fast) way out I see is to force using classic json-parse in the ping context

While waiting for external reaction I'm adding some tests to really force this problem when not parsing the version correctly

marc-portier · 2014-08-13T11:36:24Z

Quicky: adding &omitHeader=true in ping context avoids the responseHeader section holding the duplicate keys... -- will use this to get past tests for now.

Introducing json-bigint to properly handle the too-big-number issue in classic json parsing. Adding the test-cases that show the need for this in regular solr fields of type long (_l suffix) and the fact that the solr 4.x optimistic locking feature (ie the _version_ field) needs this badly. Caveat: json-bigint is currently more restrictive on json parsing then the classic JSON.parse Namely: it renders an error on duplicate key fields in the json. Since the response-header of the ping in solr introduces such a duplicate key this commit chooses to force omitHeader=true for the ping request. This can be reconsidered if json-bigint would allow for a more lenient parse() in line with the classic one.

marc-portier · 2014-08-13T18:44:42Z

Some additional notes:

just realized my branch for this fix https://github.com/marc-portier/solr-node-client/tree/v0.3.x-patch71 also holds the fix for Support the indexing of unstructured (extracted) content from files (in solr known content-types) #90 - @lbdremy: let me know if you want those in separate pull-reqs versus the main v0.3.x
Just received a reaction from solr land: refering to http://wiki.apache.org/solr/SolJSON#JSON_specific_parameters - seems they have a json.nl=arrarr option to serialize json as an array-of-arrays. That avoids duplicate-keys, something which can also pop-up at other places in json-replies apparently (not only in ping-response-header). Unsure if we want to go down that path? Would surely change the response-layout our clients are producing, so I guess we should not do that by default, but only by specifying an option do to so during client-instantiation.

In the follow up comments it also states we could avoid the duplicate-key in the ping response-header by changing the config (duplication seems to come from defaults mentioned in our test/materials/solrconfig.xml)

1086     <lst name="defaults">
1087       <str name="echoParams">all</str>
1088     </lst>

Removing that is another quick out - but could lead to more unsatisfied users of our client not understanding why the client.ping() doesn't work versus their fairly standard solrconfig.xml

My current hope remains that json-bigint will want to make their parse() more lenient so it can (better) function as a drop-in replacement for classic JSON.parse

Basically I'm (currently) fine with this patch as it is. I see some alternative routes to proceed:

accept as is and merge into v0.3.x - give or take some formal things I've maybe overlooked
possibly additionally follow-up with some reminder-issues to (somewhere in the future) consider initialisation options to allow configuring behavioral settings like json-parser to use, arrarr json options to use and whatnot more of that?
possibly accompanied with some hope on a reasonably quick fix from the json-bigint we can include
possibly combined with the attitude/agreement not to release v0.3.x until some/most/all of those new related issues would be solved too

OR

not accept until all is resolved and all dust has landed, but that will require some more upfront discussion and planning: IMHO that might not be the fastest route to pragmatically resolving issues as we go allong (just my 2c)

lbdremy · 2014-08-14T15:51:57Z

Wow ok that's a lot of stuff happening.

So, I think the right approach is to use json-bigint module as a drop-in replacement for JSON.parse like you said, the parse method of the json-bigint module should behave like the standard JSON lib concerning duplicate keys. (i.e the last value is used for the duplicate key)

We should "advertise" this problem regarding duplicate keys where we know it is relevant and suggest to the user to use the option you mentionned in the query json.nl=arrarr,in case they are interested by the two or more values.

marc-portier · 2014-08-14T16:05:20Z

Yep.
I have submitted an issue and fix in the json-bigint project - no idea when it is going to land, hoping on "soon" - See sidorares/json-bigint#6

As suggested I would like us to advance with the current state of things and put down some new issues to keep watching out for the cleanup/outcome in other regions later on. Those issues could be about:

adding documentation as you suggest about json.nl=arrarr and stuff.
upgrading to new json-bigint when it is there + revert the omitHeader trick on ping and tests (rather then include our corrected fork of json-bigint)
... and/or any other follow-up-issues you think off

Am already running the fixed libs in my project, to great satisfaction :)
Anyway, I'm eager to follow up on all sides getting it right eventually so I'm not off on a solitary fork of myself - so you easily assign some of those tasks to me.

lbdremy · 2014-08-14T16:36:50Z

Great, we could use your fork of json-bigint until your patch is landed into the original project.

I think on Saturday I'm gonna write some more tests and publish 0.3.0 with what is there (there is already plenty of nice stuff now), new features and bug fixes will be published as patch version on 0.3, like that you and other users can start using 0.3 from npm repo.

sidorares · 2014-08-15T06:10:58Z

@lbdremy published as json-bigint@v0.1.0

… in json-bigint we can re-introduce the headers in the ping response and return to the classic assertion scheme in the ping-tests. This should nicely roundup issue lbdremy#71.

lbdremy · 2014-08-16T16:49:41Z

thanks @sidorares

@marc-portier

* NEW: Add test suite to check the utility function escapeSpecialChars. * NEW: Add two utilities `escapeSpecialChars` and `escapeAndEncode`. * NEW: Escape specials chars part of the query syntax in Lucence in the query values. * NEW: Expose `format.escapeSpecialChars` through the instance of `Client`. * NEW: Add an example of use of facets. * FIX: Use a string in `q` instead of an object because special characters in the values of each key of an object are now escaped as expected. * FIX: Escape characters now escape by the solr client. * FIX: Remove waitFlush option from the command optimize in the test. This parameter in the command optimize has no effect since Solr 1.4 and has beeen definitively been removed in Solr 4.0. * FIX: Remove waitFlush in the mocked response. * FIX: Use dynamic type for last_update field instead. Avoid to declare the field last_update as a Date in the schema.xml. * FIX: Escape whitespace character. * FIX: Escape space character with double quotes in the delete by query test. * FIX: Escape special characters and white space in the value given in Client.delete(field,value,cb) * NEW: Add debugQuery method to the Query prototype. * BREAKING CHANGE: Return http.ClientRequest for all commands. Make more sense to return the HTTP client request instead of the current solr client, because first chain commands is not valuable and it allows to abort the request if wanted. * NEW: Use mocha and chai for testing instead of vows. * NEW: Ignore old test folder. * NEW: Rewrite a big part of the test suite with mocha. Some test are not yet written, but that's coming. * NEW: Add shorthand method deleteAll. Close lbdremy#35. * FIX: Remove file commitWithin-test.js * NEW: Implement shorthand command prepareCommit. Close lbdremy#37. * NEW: Implement shorthand method Client#softCommit. Close lbdremy#36. * FIX: Remove tests not yet implemented properly. * NEW: Move the paths of the request handlers in Client. Close lbdremy#34. * NEW: Add instruction before_script, the script travis-solr will install Solr before running the tests. Close lbdremy#33. * FIX: Remove old version of node 0.4 and unstable version 0.7. * FIX: Remove deprecated parameter when commiting with some options. * FIX: Use .md instead of .markdown. * NEW: Add some instructions about the contribution process. * FIX: Format correctly Roadmap section * FIX: Client#addRemoteResource was creating wrong path /solr//update. Close lbdremy#42 * NEW: Add schema.xml and solrconfig.xml used during the test suite in the Solr instance. Close lbdremy#50. * FIX: Copy the right content from the solrconfig.xml file. * FIX: Missing closing bracket. * Add usage example for client.softCommit. * Remove test/test-util.js should have been removed during the merge of master. * FIX: Add back cvs-stream module as dev dependency. * FIX: Update mocha and chai to the latest version, remove nock module from the dependency. * FIX: Remove out of date comment regarding nock and update licence. * NEW: Facilitate usage of the test coverage tool. * FIX: Undo implicit escaping of Lucene special characters made via overriding of querystring.escape. Close lbdremy#51 and close lbdremy#44. * FIX: Change the url of the solr installer on travis to use new github domain for raw content. * FIX: Use ~ instead of ^ in the version declaration to be able to install the dev dependencies with older npm. * NEW: Make the solr connection parameters used by the test suite configurable. Either via a configuration file called config.json in the root or via command line arguments thanks to figc module, see README.md for more informatio$ Close lbdremy#85 * NEW: Implement shorhand Client#searchAll(), close lbdremy#38. * FIX: Dry Client#deleteAll(). * NEW: Implement and test realtime get feature, Client#get(), close lbdremy#88 The test written is suggested in https://wiki.apache.org/solr/RealTimeGet * NEW: Rename Client#get by Client#realTimeGet, use Client#get to do arbitrary HTTP GET request. * FIX: Dry Client#spell, Client#search, Client#ping, Client#realTimeGet. * NEW: Add support for deep-paging via Query#cursorMark, close lbdremy#89. * NEW: Implement test for core, facet, group, mlt queries. ref lbdremy#89 * NEW: Add command to run plato, a static analysis and complexity tool. * REFACTOR: Make method signatures of all methods using Client#get consistent regarding the query argument. They all accept a query argument that can be either a String, a Query or a Object, also the query is not optional. * REFACTOR: Dry Client#deleteByRange by using Client#deleteByQuery. * FIX: Dry a bit getJSON and postJSON private functions and fix multi-byte characters bug. By setting the encoding on the response object, see http://nodejs.org/api/stream.html#stream_readable_setencoding_encoding, close lbdremy#81.' * FIX: Declare httperror version the old way for node 0.8. * FIX: Update httperror module to work on node 0.8. * FIX: Use json-bigint module to properly handle the too-big-number issue in classic json parsing. Add the test-cases that show the need for this in regular solr fields of type long (_l suffix) and the fact that the solr 4.x optimistic locking feature (ie the _version_ field) needs this badly. Close lbdremy#71 Conflicts: lib/solr.js * NEW: Update json-bigint to the latest version 0.1.3. * NEW: Update JSONStream and request modules to the latest version. * NEW: Add support for HTTPS, close lbdremy#111. Usage: var client = solr.createClient({ secure : true}); * NEW: Update and add examples for Client#{get, softCommit, prepareCommit, realTimeGet, spell}. * FIX: Update README to reflech recent changes made on the library. * RELEASE: v0.3.0, big thanks to @marc-portier . * NEW: Add notes regarding migration between 0.2.x and 0.3.x . * NEW: Added atomicUpdate() to partially update documents. * FIX: Move up comment related to test coverage into its section. * BREAKING CHANGE: JSONbig is not anymore the default json serializer/deserializer, native JSON lib is used instead. In order to use the JSONbig serializer/deserializer, initialize the client with bigint to true i.e solr.createClient({ bigint : true}) or via client.options.bigint = true. The reason for this change is that JSONbig has a significant performance impact when manipulating "large" chunk of data. * NEW: Update request module to version 2.42.0. * NEW: Run the test suite against Solr version 4.9.0 on Travis. * NEW: Run the test suite with node.js version 0.11.x too. * NEW: Configure npm test command to run the test suite first with JSON native then again with JSONbig. * NEW: Mention why/how to use JSONbig. * NEW: Pass a whole new object containing required properties to postJSON and getJSON functions instead of a reference to Client#options. This allow properties to be static across the whole request/response cycle, particularly interesting for the property bigint which is quite subject to change between requests and since bigint value is used to determine JSON (de)serializer so at request time and response time you see where I am going! * NEW: Add migration guide between 0.3.x and 0.4.x. * RELEASE: v0.4.0 * Added Gitter badge * FIX: Remove format.escapeAndEncode method, and replace with encodeURIComponent where appropriate * NEW: Update the version of Solr to 4.9.1 used in travis.ci. * FIX: Only escape && and || and don't escape spaces * FIX: Update request module to the latest version available 2.49.0. * FIX: Use molivawe travis-solr script before multicore option was a default to start Solr. * NEW: Add migration instructions between 0.4.x and 0.5.x. * RELEASE: v0.5.0 * * Added post method handler * Switched search to post for super big queries (e.g. anything over 2048 chars) * * Updated post handler to send and encode parameters accurately * * Updated post to compile parameters properly * * Removed console debugging * * Added function docs * * Unnecessary whitespace change * * Prefer GET for searching, but for large queries use POST * Added a method in the query class to set the default field to search. Also added a simple test for it. * Removed test coverage on NodeJs v 0.8 * added solrVersion option to client constructor. Fixes lbdremy#142 * Config Cleanup + Feedback Integration * `get_max` => `get_max_request_entity_size` * Removed local test core from the config (cherry picked from commit ba9fa70) * Added support for Solr5 query highlighting. Fixes lbdremy#130 * Added support for Collection admin tasks. Only Create is supported for now. * Made filename singular to match the rest of the naming scheme * changed name of collections.js to collection.js * trying to make the client.createCollection() function work. * Added collection RELOAD and DELETE. Finished testing, fixed bugs. These features all seem to work as intended now. * Turned off auto-posting by default. * Can opt-in to auto-posting by setting `get_max_request_entity_size` to a number. (e.g. 2048) * Changed URL length check to be UTF8-friendly * renamed client.adminExecute() to client.executeCollection() to make room for the possibility of client.executeCore() later. * Added all Solr v5.0 collection features. * added client.query(), added deprecation warning to client.createQuery() * Added collection creation test. Untested, waiting for a functional multi-node Solr cloud installation. * Changed name to better describe function. Add and delete tests are passing. * new directory structure to avoid tests failing due to solrCloud not running on test server. * Added more tests * Still more tests. All passing. * More tests passing, fixed some syntax. * Last test added. REBALANCELEADERS fails on my solr 5.0 as 'unknown action.' Need to re-run with Solr v5.4. * updated to ignore swp files. * removed swp file * Implement The TermsComponent Implement The TermsComponent Test need to modify solrconfig.xml http://wiki.apache.org/solr/TermsComponent * fix(queryHighlighting): change some query hl params to prevent future problems some numerical parameters could be specified as '0', causing them to be interpreted as 'false' by the Query.prototype.hl function. * test(queryHighlighting): test two simple, comprehensive test cases Two simple cases that include all options. In the future, these cases could be separated out into more typical use cases. * style: remove un-needed comment Comment was a to-do list for the second test case. Test case was added, comment wasn't removed. Removing to avoid later confusion. * chore(travisCI): add node v4.1.1 to build Node recently released a major update. Currently, tests pass with Node v4.1.1. Should continue to test a 4.XX version of Node. * Fix terms-query-test - Remove wiki page text - Change formatting - Change test code * Change terms-query-test formatting * Change terms-query-test Change terms-query-test because terms search dose not return "responseHeader.params". * feat: Update dependencies version to latest. Possibly breaking. * fix: Fix tests following npm dependency packages version update * feat: Add support for facet.pivot and facet.pivot.mincount (issue lbdremy#146) * feat(Facet.Pivot): Finish implementation of facet.pivot and facet.pivot.mincount + update according tests * feat(Utils): Add some function tooling to prepare for centralization of helpers, to ease a move to a dependency like lodash for example, and to clean/standardize the code. + tests * feat(Utils+Standardization): Replacing some pieces of code with standardized + centralized util helper * feat(VersionUtil): Add a helper to support the version checking so we can increase the feature coverage while still supporting older versions of Solr * feat(solrVersion): Adds solrVersion support and ability to pass a solrVersion to the Client so queries will use features that are only available to specific versions, without compromising on backward compatibility. * docs(Readme): Add some more information of how some new features work, add some examples, re-organized migration doc chronologically reverse to be more meaningful. * docs(Contributions): Update contributions page to something more accurate / up-to-speed with latest project directions * docs(example): query highlighting example Added minimal example for the Standard Query Highlighter * test(Client.prototype.spell): Simple response header test Test case will execute a simple query on the /spell query handler, check for '0' response status. * chore(Package.son): Bump package version in Package.json * docs(readme): fix name * test(rangeFilter): test array of multiple fields and obj of single field Fixes lbdremy#93 * add license to package.json for easy automated extraction * Only check options.pivot.mincount if options.pivot is available. * URI encode the cursorMark value before use URI encoding needed because cursorMark raw value can contain characters not allowed without uri encoding * Add support for ipv6 when resolving host/hostname There was no way of setting the "family" option for http.request() which is necesary when speaking with a server that only responds on ipv6. The new option "ipVersion" is now passed to http/https * Do not send the parameters in the url when using POST. * feat(Error): Properly capture and proxy status codes and error messages * Update delete.js * Remove client.terms - Remove client.terms - Update terms-query-test * Update solr - Change client.terms to client.temrsSearch * docs: better installation instructions. * * escaping fl parameter in Query * fix markdown * Added bluebird .promisifyAll to Client prototype functions * version bump * Update request dependency to current version Previous version has a known security vulnerability: https://nodesecurity.io/advisories/309 * Bump to 0.7.0 Bump the version to a new branch so we can test before releasing. * update npm dependencies. A critical vulnerability made this unusable. * Update travis to test on modern platforms.

lbdremy added the bug label Jul 14, 2014

marc-portier self-assigned this Aug 12, 2014

lbdremy added the resolved waiting v0.3 to close label Aug 16, 2014

marc-portier closed this as completed in 9eef5b0 Aug 16, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cutted _version_ field #71

Cutted _version_ field #71

jaceksocha commented Sep 25, 2013

lbdremy commented Oct 2, 2013

marc-portier commented Aug 12, 2014

marc-portier commented Aug 13, 2014

marc-portier commented Aug 13, 2014

marc-portier commented Aug 13, 2014

lbdremy commented Aug 14, 2014

marc-portier commented Aug 14, 2014

lbdremy commented Aug 14, 2014

sidorares commented Aug 15, 2014

lbdremy commented Aug 16, 2014

Cutted _version_ field #71

Cutted _version_ field #71

Comments

jaceksocha commented Sep 25, 2013

lbdremy commented Oct 2, 2013

marc-portier commented Aug 12, 2014

marc-portier commented Aug 13, 2014

marc-portier commented Aug 13, 2014

marc-portier commented Aug 13, 2014

lbdremy commented Aug 14, 2014

marc-portier commented Aug 14, 2014

lbdremy commented Aug 14, 2014

sidorares commented Aug 15, 2014

lbdremy commented Aug 16, 2014