Question
Does the ElasticsearchDocumentStore support nested data types? If so, how can I select a nested field as the search field or text field?
Additional context
- I am able to retrieve documents using document_store.get_all_documents_in_index('document_index'):
In [6]: print(document_store.get_document_count())
...: for doc in document_store.get_all_documents_in_index('document_index'):
...: pp.pprint(doc)
...: break
...:
...:
11196
{ '_id': 'qWwo9nIB7CZChbLK4FjC',
'_index': 'document_index',
'_score': None,
'_source': { 'actor_type': 'content',
'media': [ { 'actor_id': [],
'actor_type': 'content',
'body': 'FTS International Has Fragility '
'In The Short-Term; Balance Sheet '
'Remains Vulnerable (NYSE:FTSI) '
'Its high leverage ratio is a '
'significant risk factor in the '
'current environment.\n'
'\n'
...
'linked_concept_search_id': [342],
'locations': None,
'media_type': 'rss',
'meta': [ { 'key': 'link',
'value': 'https://seekingalpha.com/article/4352427-fts-international-fragility-in-short-term-balance-sheet-remains-vulnerable'}],
'similar_dictionaries': [],
'sql_handle_id': None,
'sql_media_id': 'gXVpzzMzkQ',
'tags': [ 231027,
233849,
233408,
231786,
231102,
231124,
231857,
233795
],
'title': 'FTS International Has Fragility '
'In The Short-Term; Balance Sheet '
'Remains Vulnerable '
'(NYSE:FTSI)'}]},
'_type': 'actor',
'sort': [2]}
- document_store.get_all_documents() gives a KeyError:
In [7]: document_store.get_all_documents()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-7-0571d29188b7> in <module>
----> 1 document_store.get_all_documents()
~/miniconda3/lib/python3.7/site-packages/haystack/database/elasticsearch.py in get_all_documents(self)
153 def get_all_documents(self) -> List[Document]:
154 result = scan(self.client, query={"query": {"match_all": {}}}, index=self.index)
--> 155 documents = [self._convert_es_hit_to_document(hit) for hit in result]
156 return documents
157
~/miniconda3/lib/python3.7/site-packages/haystack/database/elasticsearch.py in <listcomp>(.0)
153 def get_all_documents(self) -> List[Document]:
154 result = scan(self.client, query={"query": {"match_all": {}}}, index=self.index)
--> 155 documents = [self._convert_es_hit_to_document(hit) for hit in result]
156 return documents
157
~/miniconda3/lib/python3.7/site-packages/haystack/database/elasticsearch.py in _convert_es_hit_to_document(self, hit, score_adjustment)
285 document = Document(
286 id=hit["_id"],
--> 287 text=hit["_source"][self.text_field],
288 external_source_id=hit["_source"].get(self.external_source_id_field),
289 meta=meta_data,
KeyError: 'media.full_body'
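The traceback shows that text_field is used as a literal dictionary key on hit["_source"], so a dotted name such as 'media.full_body' is never split into a path. The following is a minimal sketch of that behavior, not Haystack code: resolve_dotted_path is a hypothetical helper, and the sample hit is a trimmed stand-in for the document printed above.

```python
from typing import Any, List


def resolve_dotted_path(source: dict, path: str) -> List[Any]:
    """Collect every value reachable via a dotted path, descending into lists.

    Hypothetical helper for illustration only; it is not part of Haystack.
    """
    current: List[Any] = [source]
    for key in path.split("."):
        next_level: List[Any] = []
        for node in current:
            # An array of objects (like 'media') contributes each of its items.
            candidates = node if isinstance(node, list) else [node]
            for item in candidates:
                if isinstance(item, dict) and key in item:
                    next_level.append(item[key])
        current = next_level
    return current


# Trimmed stand-in for the '_source' shown above (values shortened).
sample_source = {
    "actor_type": "content",
    "media": [
        {"title": "FTS International Has Fragility", "body": "Its high leverage ratio ..."}
    ],
}

# A plain key lookup treats the dotted name as one key, which is what raises KeyError.
print("media.title" in sample_source)

# Walking the path segment by segment reaches the nested values.
print(resolve_dotted_path(sample_source, "media.title"))
```

Because 'media' is a list of objects, a single dotted path can match several values, so any fix along these lines would also have to decide how to combine them into one text string.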
- I tried a flattened (dot) notation for the field names, but the retriever returned no results:
In [1]: from haystack.retriever.sparse import ElasticsearchRetriever
...: from haystack.database.elasticsearch import ElasticsearchDocumentStore
In [2]: document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", text_field="media.full_body", index="vopak-monitoring", name_field="media.title",
...: search_fields=["media.full_body", "media.title"], create_index=False)
...: retriever = ElasticsearchRetriever(document_store=document_store)
...: res = retriever.retrieve("Energy")
...: print(res)
[]
- I also tried using a custom query, without success.
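For reference, this is roughly the shape of query body that Elasticsearch expects when a field is mapped with the nested type. It is a sketch under assumptions: it assumes 'media' is mapped as nested (if it is a plain array of objects, ordinary dot-notation queries should already match), and build_nested_query is a hypothetical helper name, not a Haystack or Elasticsearch function.

```python
import json
from typing import Any, Dict, List


def build_nested_query(path: str, fields: List[str], question: str, size: int = 10) -> Dict[str, Any]:
    """Build an Elasticsearch 'nested' query body (hypothetical helper)."""
    return {
        "size": size,
        "query": {
            "nested": {
                # Path of the nested object; assumes the mapping type is 'nested'.
                "path": path,
                # Inner query runs against each nested object; fields use full dotted names.
                "query": {"multi_match": {"query": question, "fields": fields}},
                # Ask Elasticsearch to return which nested objects matched.
                "inner_hits": {},
            }
        },
    }


query_body = build_nested_query(
    path="media",
    fields=["media.full_body", "media.title"],
    question="Energy",
)
print(json.dumps(query_body, indent=2))
```

A body like this could be sent directly with the Elasticsearch Python client, e.g. client.search(index="vopak-monitoring", body=query_body), to check whether the data matches independently of the document store.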