issues when indexing hash of hashes #88

Open
sakrafd opened this Issue Oct 26, 2010 · 9 comments

sakrafd commented Oct 26, 2010

I have a couchdb doc with the following attribute (a hash containing another hash):

"subscriptions": {
"2999": {
"subscription_type_id": 1,
"source_keyword_id": 23629
},
"1668": {
"subscription_type_id": 1,
"source_keyword_id": 3099
}
}

My fulltext search function always returns undefined for the inner hash (i.e. I can't access doc.subscriptions["2999"].subscription_type_id). Here is the relevant part:

        if (doc.subscriptions && doc.subscriptions != null) {
          for (var sc in doc.subscriptions) {
            var subscription_campaign_id = parseInt(sc, 10);
            if (doc.subscriptions[sc] != null) {
              log.info('subscription_type_id' + doc.subscriptions[sc].subscription_type_id);  
              result.add(subscription_campaign_id, {'field':'subscription_campaign_id', 'type':'long'});
              result.add(doc.subscriptions[sc].subscription_type_id, {'field':'subscription_type_id', 'type':'long'});
              result.add(doc.subscriptions[sc].source_keyword_id, {'field':'source_keyword_id', 'type':'long'});
            } else {
              log.info('doc.subscriptions[sc] is null');
            }
          }
        }

The same code works fine in plain JavaScript, so the problem seems to be somewhere in couchdb-lucene. Let me know if you would like the full doc and search function.

Thanks,
Dave
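
For anyone who wants to reproduce the "works fine in plain JavaScript" observation, here is a minimal sketch of the same loop run outside couchdb-lucene. `collectFields` and `sampleDoc` are hypothetical names used only for illustration; the helper returns plain objects instead of calling result.add(), so it assumes nothing about the indexer.

```javascript
// Plain-JavaScript version of the loop from the report.
// collectFields is a hypothetical helper, not part of couchdb-lucene.
function collectFields(doc) {
  var fields = [];
  if (doc.subscriptions) {
    for (var sc in doc.subscriptions) {
      var entry = doc.subscriptions[sc];
      if (entry != null) {
        fields.push({
          keyType: typeof sc, // for...in always yields property names as strings
          subscription_campaign_id: parseInt(sc, 10),
          subscription_type_id: entry.subscription_type_id,
          source_keyword_id: entry.source_keyword_id
        });
      }
    }
  }
  return fields;
}

// The document fragment from the report, parsed by the engine's own JSON parser.
var sampleDoc = JSON.parse(
  '{"subscriptions": {' +
  '"2999": {"subscription_type_id": 1, "source_keyword_id": 23629},' +
  '"1668": {"subscription_type_id": 1, "source_keyword_id": 3099}}}'
);
var fields = collectFields(sampleDoc);
```

In a standard engine both inner hashes are reachable through their digit-only keys, which is the behaviour the index function was expected to show.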

Owner

rnewson commented Oct 26, 2010

Anything in the logs? I threw a simple test together to see if nested structures can be emitted, and it passes:

@Test
public void testNested() throws Exception {
    final String fun = "function(doc) { var ret = new Document(); ret.add(doc.foo[\"bar\"]); return ret; }";
    final DocumentConverter converter = new DocumentConverter(context,
            view(fun));
    final Document[] result = converter.convert(
            doc("{_id:\"hi\", foo: { bar: \"baz\"}}"), settings(), null);
    assertThat(result.length, is(1));
    assertThat(result[0].get("default"), is("baz"));
}

sakrafd commented Oct 26, 2010

No, just my log statements:

2010-10-26 16:41:30,386 INFO [lucene_test] Indexing from update_seq 0
2010-10-26 16:41:30,454 INFO [JSLog] doc.subscriptions[sc] is null
2010-10-26 16:41:30,455 INFO [JSLog] doc.subscriptions[sc] is null
2010-10-26 16:41:45,406 INFO [lucene_test] View[digest=a0ww492t5g4b1vhy7i2mhpl9w] now at update_seq 21

Owner

rnewson commented Oct 26, 2010

It seems to be because your keys are numbers.

sakrafd commented Oct 26, 2010

That seems to be it. I'm actually saving the keys as strings (and the source from couchdb shows them as strings), so

"subscriptions": {
"a2999": {
"subscription_type_id": 1,
"source_keyword_id": 23629
},
"b1668": {
"subscription_type_id": 1,
"source_keyword_id": 3099
}
}

works, but

"subscriptions": {
"2999": {
"subscription_type_id": 1,
"source_keyword_id": 23629
},
"1668": {
"subscription_type_id": 1,
"source_keyword_id": 3099
}
}

doesn't. When I log typeof(key) for "2999", it reports a string. Any idea where it is getting converted back to an int?

Owner

rnewson commented Oct 27, 2010

This is where the JSON string from couchdb is converted to an object in couchdb-lucene:

final JSONObject json = JSONObject.fromObject(line);

I'll check tomorrow, but perhaps this is where the string is being converted to an int?

Owner

rnewson commented Feb 14, 2011

The JSON library has changed recently; is this still an issue?

I also see this issue with a hash whose keys are numbers (though presented as strings in couchdb). Note that couchapps work perfectly well with this, so the problem is isolated to couchdb-lucene, probably 'lost in translation' between couch/java/javascript. I am using a fairly recent git clone (less than a week old).

I'm experiencing this as well with the current 0.10.0 snapshot from Git. I can iterate over objects with digit-containing properties, and log.info(typeof(key)) reports the property names are in fact strings. Trying to actually access those properties produces errors like WARN [test] foo caused TypeError: Cannot read property "length" from undefined (unnamed script#3).

I also discovered a workaround: at the start of your index function, add:
doc = JSON.parse(JSON.stringify(doc));
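
To show where that line sits, here is a rough sketch of the index function from the original report with the workaround applied. The `Document` stand-in below is hypothetical and only mimics the add(value, options) calls seen in this thread so the sketch can run outside the indexer; inside couchdb-lucene you would drop it and use the real one. `indexSubscriptions` is likewise just an illustrative name. Presumably the round trip helps because it leaves `doc` as a native JavaScript object, but that explanation is an assumption, not something confirmed here.

```javascript
// Hypothetical stand-in for couchdb-lucene's Document; remove inside the indexer.
function Document() {
  this.fields = [];
}
Document.prototype.add = function (value, options) {
  this.fields.push({ value: value, options: options || {} });
};

function indexSubscriptions(doc) {
  // Workaround from this thread: round-trip through JSON before
  // touching any property of the incoming document.
  doc = JSON.parse(JSON.stringify(doc));

  var ret = new Document();
  for (var sc in doc.subscriptions) {
    var entry = doc.subscriptions[sc];
    if (entry == null) continue; // skip null or missing inner hashes
    ret.add(parseInt(sc, 10), { field: 'subscription_campaign_id', type: 'long' });
    ret.add(entry.subscription_type_id, { field: 'subscription_type_id', type: 'long' });
    ret.add(entry.source_keyword_id, { field: 'source_keyword_id', type: 'long' });
  }
  return ret;
}
```

The cost is one extra serialize/parse pass per document, which is why it reads as a stopgap rather than a fix.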

Having exactly the same issue with numeric indexes: it works perfectly with text, but when it comes to numbers, the values are inaccessible.

@dylantack you are the man! It's an ugly workaround, but by far more elegant than anything else we could do. Thanks for sharing.
