❌ Errors in create_base_entity_graph #437

Borui66111 · 2024-07-08T13:26:48Z

I successfully reproduced the results of the official demo from the Get Started guide.
The error occurs when I run the pipeline on new data: I extracted the text from a company's PDF report and stored it in input.txt. The file is UTF-8 encoded, and the document is rather short compared to the example.
The following error appears in the log:

raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key

Can anyone help me with this issue, please? I am not very familiar with the pipeline or the technical details behind it; I'm mostly exploring at this point. Thanks in advance!

paid-ltd · 2024-07-08T19:44:45Z

I'm having this same issue.

adoresever · 2024-07-09T01:10:07Z

me too

lifelmy · 2024-07-09T02:22:54Z

me too

paid-ltd · 2024-07-09T02:37:14Z

I did repo into the same directory and I got it working

aviraen · 2024-07-09T08:16:14Z

same issue !!!! any solution

fryfry33 · 2024-07-09T08:29:19Z

> same issue !!!! any solution

Try modifying max_tokens. It worked for me (I set it to 1700 for gpt 3.5).

aviraen · 2024-07-09T08:49:52Z

Did you use gpt 3.5 turbo or the same? I tried with 1200 max tokens, but it is still not working.

fryfry33 · 2024-07-09T09:10:11Z

I used gpt 3.5 turbo from Azure.

AlonsoGuevara · 2024-07-09T20:35:07Z

Hi, can you please inspect any of the cache entries for entity extraction and paste a result? I suspect it is an entity extraction issue.

Borui66111 · 2024-07-10T23:59:08Z

Result from cache/entity_extraction/chat-d87d9cc79a03b34a16a6895b3d54f53a
{"result": "There is no text provided to analyze. Please provide a text document to proceed with the entity and relationship extraction.\n\n<|COMPLETE|>", "input": "\n-Goal-\nGiven a text document that is potentially relevant to this activity and a list of entity types, identify all entities of those types from the text and all relationships among the identified entities.\n\n-Steps-\n1. Identify all entities. For each identified entity, extract the following information:\n- entity_name: Name of the entity, capitalized\n- entity_type: One of the following types: [organization,person,geo,event]\n- entity_description: Comprehensive description of the entity's attributes and activities\nFormat each entity as (\"entity\"<|><entity_name><|><entity_type><|><entity_description>\n\n2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.\nFor each pair of related entities, extract the following information:\n- source_entity: name of the source entity, as identified in step 1\n- target_entity: name of the target entity, as identified in step 1\n- relationship_description: explanation as to why you think the source entity and the target entity are related to each other\n- relationship_strength: a numeric score indicating strength of the relationship between the source entity and target entity\n Format each relationship as (\"relationship\"<|><source_entity><|><target_entity><|><relationship_description><|><relationship_strength>)\n\n3. Return output in English as a single list of all the entities and relationships identified in steps 1 and 2. Use **##** as the list delimiter.\n\n4. When finished, output <|COMPLETE|>\n\n######################\n-Examples-\n######################\nExample 1:\n\nEntity_types: [person, technology, mission, organization, location]\nText:\nwhile Alex clenched his jaw, the buzz of frustration dull against the backdrop of Taylor's authoritarian certainty. 
It was this competitive undercurrent that kept him alert, the sense that his and Jordan's shared commitment to discovery was an unspoken rebellion against Cruz's narrowing vision of control and order.\n\nThen Taylor did something unexpected. They paused beside Jordan and, for a moment, observed the device with something akin to reverence. \u201cIf this tech can be understood...\" Taylor said, their voice quieter, \"It could change the game for us. For all of us.\u201d\n\nThe underlying dismissal earlier seemed to falter, replaced by a glimpse of reluctant respect for the gravity of what lay in their hands. Jordan looked up, and for a fleeting heartbeat, their eyes locked with Taylor's, a wordless clash of wills softening into an uneasy truce.\n\nIt was a small transformation, barely perceptible, but one that Alex noted with an inward nod. They had all been brought here by different paths\n################\nOutput:\n(\"entity\"<|>\"Alex\"<|>\"person\"<|>\"Alex is a character who experiences frustration and is observant of the dynamics among other characters.\")##\n(\"entity\"<|>\"Taylor\"<|>\"person\"<|>\"Taylor is portrayed with authoritarian certainty and shows a moment of reverence towards a device, indicating a change in perspective.\")##\n(\"entity\"<|>\"Jordan\"<|>\"person\"<|>\"Jordan shares a commitment to discovery and has a significant interaction with Taylor regarding a device.\")##\n(\"entity\"<|>\"Cruz\"<|>\"person\"<|>\"Cruz is associated with a vision of control and order, influencing the dynamics among other characters.\")##\n(\"entity\"<|>\"The Device\"<|>\"technology\"<|>\"The Device is central to the story, with potential game-changing implications, and is revered by Taylor.\")##\n(\"relationship\"<|>\"Alex\"<|>\"Taylor\"<|>\"Alex is affected by Taylor's authoritarian certainty and observes changes in Taylor's attitude towards the device.\"<|>7)##\n(\"relationship\"<|>\"Alex\"<|>\"Jordan\"<|>\"Alex and Jordan share a commitment to discovery, which 
contrasts with Cruz's vision.\"<|>6)##\n(\"relationship\"<|>\"Taylor\"<|>\"Jordan\"<|>\"Taylor and Jordan interact directly regarding the device, leading to a moment of mutual respect and an uneasy truce.\"<|>8)##\n(\"relationship\"<|>\"Jordan\"<|>\"Cruz\"<|>\"Jordan's commitment to discovery is in rebellion against Cruz's vision of control and order.\"<|>5)##\n(\"relationship\"<|>\"Taylor\"<|>\"The Device\"<|>\"Taylor shows reverence towards the device, indicating its importance and potential impact.\"<|>9)<|COMPLETE|>\n#############################\nExample 2:\n\nEntity_types: [person, technology, mission, organization, location]\nText:\nThey were no longer mere operatives; they had become guardians of a threshold, keepers of a message from a realm beyond stars and stripes. This elevation in their mission could not be shackled by regulations and established protocols\u2014it demanded a new perspective, a new resolve.\n\nTension threaded through the dialogue of beeps and static as communications with Washington buzzed in the background. The team stood, a portentous air enveloping them. It was clear that the decisions they made in the ensuing hours could redefine humanity's place in the cosmos or condemn them to ignorance and potential peril.\n\nTheir connection to the stars solidified, the group moved to address the crystallizing warning, shifting from passive recipients to active participants. Mercer's latter instincts gained precedence\u2014 the team's mandate had evolved, no longer solely to observe and report but to interact and prepare. 
A metamorphosis had begun, and Operation: Dulce hummed with the newfound frequency of their daring, a tone set not by the earthly\n#############\nOutput:\n(\"entity\"<|>\"Washington\"<|>\"location\"<|>\"Washington is a location where communications are being received, indicating its importance in the decision-making process.\")##\n(\"entity\"<|>\"Operation: Dulce\"<|>\"mission\"<|>\"Operation: Dulce is described as a mission that has evolved to interact and prepare, indicating a significant shift in objectives and activities.\")##\n(\"entity\"<|>\"The team\"<|>\"organization\"<|>\"The team is portrayed as a group of individuals who have transitioned from passive observers to active participants in a mission, showing a dynamic change in their role.\")##\n(\"relationship\"<|>\"The team\"<|>\"Washington\"<|>\"The team receives communications from Washington, which influences their decision-making process.\"<|>7)##\n(\"relationship\"<|>\"The team\"<|>\"Operation: Dulce\"<|>\"The team is directly involved in Operation: Dulce, executing its evolved objectives and activities.\"<|>9)<|COMPLETE|>\n#############################\nExample 3:\n\nEntity_types: [person, role, technology, organization, event, location, concept]\nText:\ntheir voice slicing through the buzz of activity. \"Control may be an illusion when facing an intelligence that literally writes its own rules,\" they stated stoically, casting a watchful eye over the flurry of data.\n\n\"It's like it's learning to communicate,\" offered Sam Rivera from a nearby interface, their youthful energy boding a mix of awe and anxiety. \"This gives talking to strangers' a whole new meaning.\"\n\nAlex surveyed his team\u2014each face a study in concentration, determination, and not a small measure of trepidation. 
\"This might well be our first contact,\" he acknowledged, \"And we need to be ready for whatever answers back.\"\n\nTogether, they stood on the edge of the unknown, forging humanity's response to a message from the heavens. The ensuing silence was palpable\u2014a collective introspection about their role in this grand cosmic play, one that could rewrite human history.\n\nThe encrypted dialogue continued to unfold, its intricate patterns showing an almost uncanny anticipation\n#############\nOutput:\n(\"entity\"<|>\"Sam Rivera\"<|>\"person\"<|>\"Sam Rivera is a member of a team working on communicating with an unknown intelligence, showing a mix of awe and anxiety.\")##\n(\"entity\"<|>\"Alex\"<|>\"person\"<|>\"Alex is the leader of a team attempting first contact with an unknown intelligence, acknowledging the significance of their task.\")##\n(\"entity\"<|>\"Control\"<|>\"concept\"<|>\"Control refers to the ability to manage or govern, which is challenged by an intelligence that writes its own rules.\")##\n(\"entity\"<|>\"Intelligence\"<|>\"concept\"<|>\"Intelligence here refers to an unknown entity capable of writing its own rules and learning to communicate.\")##\n(\"entity\"<|>\"First Contact\"<|>\"event\"<|>\"First Contact is the potential initial communication between humanity and an unknown intelligence.\")##\n(\"entity\"<|>\"Humanity's Response\"<|>\"event\"<|>\"Humanity's Response is the collective action taken by Alex's team in response to a message from an unknown intelligence.\")##\n(\"relationship\"<|>\"Sam Rivera\"<|>\"Intelligence\"<|>\"Sam Rivera is directly involved in the process of learning to communicate with the unknown intelligence.\"<|>9)##\n(\"relationship\"<|>\"Alex\"<|>\"First Contact\"<|>\"Alex leads the team that might be making the First Contact with the unknown intelligence.\"<|>10)##\n(\"relationship\"<|>\"Alex\"<|>\"Humanity's Response\"<|>\"Alex and his team are the key figures in Humanity's Response to the unknown 
intelligence.\"<|>8)##\n(\"relationship\"<|>\"Control\"<|>\"Intelligence\"<|>\"The concept of Control is challenged by the Intelligence that writes its own rules.\"<|>7)<|COMPLETE|>\n#############################\n-Real Data-\n######################\nEntity_types: organization,person,geo,event\nText: None\n######################\nOutput:", "parameters": {"model": "gpt-4o", "temperature": 0.0, "frequency_penalty": 0.0, "presence_penalty": 0.0, "top_p": 1.0, "max_tokens": 4000, "n": null}}

It says no text provided, but I do have a txt file: root/input/input.txt
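The check done by hand above can be sketched as a small script. This is only an illustration, assuming that cache entries are JSON files named `chat-*` under `cache/entity_extraction` with string `result` and `input` fields, as in the entry pasted above. The function names `check_cache_entry` and `scan_cache` are hypothetical and not part of GraphRAG.

```python
import json
from pathlib import Path


def check_cache_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one cached entity-extraction call."""
    problems = []
    prompt = entry.get("input", "")
    result = entry.get("result", "")
    # Only look at the part of the prompt after the few-shot examples.
    real_data = prompt.rsplit("-Real Data-", 1)[-1]
    if "Text: None" in real_data:
        problems.append("prompt has no input text (Text: None)")
    # Extracted entities are expected to look like ("entity"<|>...).
    if '("entity"' not in result:
        problems.append("response contains no entity records")
    return problems


def scan_cache(cache_dir: str) -> dict[str, list[str]]:
    """Check every chat-* file in cache_dir and report entries with problems."""
    report = {}
    for path in sorted(Path(cache_dir).glob("chat-*")):
        try:
            entry = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError) as err:
            report[path.name] = [f"unreadable entry: {err}"]
            continue
        problems = check_cache_entry(entry)
        if problems:
            report[path.name] = problems
    return report


# A shortened stand-in for the entry pasted above.
sample = {
    "result": "There is no text provided to analyze.\n\n<|COMPLETE|>",
    "input": "-Goal-\n...\n-Real Data-\nEntity_types: organization\nText: None\nOutput:",
}
print(check_cache_entry(sample))
```

Calling `scan_cache("cache/entity_extraction")` from the project root would then list the entries worth opening by hand.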

Also, regarding the max token length, which setting are you referring to? Is it the one for llm (its default is 4000, and other tasks use different values)?

CyanMystery · 2024-07-15T08:49:43Z

> Hi, can you please inspect any of the cache entries for entity extraction and paste a result? I suspect it is an entity extraction issue.

Can you provide some examples of your correct file output? Thanks.

springtiger · 2024-07-19T08:22:06Z

I used Ollama to deploy the LLM locally and changed the model to qwen2, leaving everything else the same. But it doesn't work with glm4.

github-actions · 2024-07-29T01:52:27Z

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

Maekfei · 2024-08-01T03:20:39Z

It works fine when I use qwen2, but when I switch to another custom model, it doesn't work.

maverick001 · 2024-08-12T04:09:12Z

Same issue here; it failed with both gpt-4o and gpt-4o-mini.
.\cache\entity_extraction is empty.
Please advise if there's any fix or viable workaround.

Maekfei · 2024-08-12T04:17:36Z

> Same issue here; it failed with both gpt-4o and gpt-4o-mini. .\cache\entity_extraction is empty. Please advise if there's any fix or viable workaround.

What is your embedding model? Please send me the detailed log.

night666e · 2024-08-12T06:51:06Z

indexing-engine.log
logs.json
Could you help me look into my problem? I am using the glm4 and bge-m3 models served by Xinference.

Maekfei · 2024-08-12T08:33:55Z

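> indexing-engine.log logs.json Could you help me look into my problem? I am using the glm4 and bge-m3 models served by Xinference.

I never managed to find the cause of this error either. It is probably related to the model being used, so I suggest trying a different model. I ran into a similar problem before, and it was fine after I switched models.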

adoresever · 2024-08-12T08:36:07Z

Hello, could you be more specific?

github-actions · 2024-08-20T01:51:28Z

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

wangSirGH · 2024-08-22T09:00:51Z

I ran into the same problem too. Could someone experienced please help?

night666e · 2024-08-28T09:38:39Z

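> indexing-engine.log logs.json Could you help me look into my problem? I am using the glm4 and bge-m3 models served by Xinference.
> I never managed to find the cause of this error either. It is probably related to the model being used, so I suggest trying a different model. I ran into a similar problem before, and it was fine after I switched models.

Do you have any good recommendations? In my case, it runs normally if I leave the prompts unchanged. Once I modify them, I hit this problem no matter how I modify them.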

charlielu05 · 2024-08-29T05:15:27Z

> indexing-engine.log logs.json Could you help me look into my problem? I am using the glm4 and bge-m3 models served by Xinference.

In your logs, it's failing at output_df[[level_to, to]] = pd.DataFrame(graph_level_pairs, index=output_df.index), but I can't find that line of code in the latest version, 0.3.2.
The closest line there is output_df[[level_to, to]] = pd.DataFrame(output_df[to].tolist(), index=output_df.index)
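Both lines quoted above are a pandas assignment of a DataFrame to two columns at once. The following is a minimal sketch of how that pattern can raise "Columns must be same length as key"; it is not GraphRAG's code, and the names `split_pairs`, `pair`, `level`, and `node` are made up for illustration. The assumption is that the right-hand side ends up with a different number of columns than the two being assigned, for example when no entities were extracted.

```python
import pandas as pd


def split_pairs(df: pd.DataFrame) -> pd.DataFrame:
    """Expand a column of [level, node] lists into two new columns."""
    out = df.copy()
    # Same shape of assignment as in the traceback: two target columns on the
    # left, a DataFrame built from a list column on the right.
    out[["level", "node"]] = pd.DataFrame(out["pair"].tolist(), index=out.index)
    return out


# Each row holds exactly two items, so the right-hand side has two columns.
good = pd.DataFrame({"pair": [[0, "ACME"], [1, "BOB"]]})
print(split_pairs(good))

# Rows with nothing in them give a right-hand side with zero columns,
# which cannot fill the two target columns.
empty = pd.DataFrame({"pair": [[], []]})
try:
    split_pairs(empty)
except ValueError as err:
    print(f"ValueError: {err}")
```

If this is what happens in the pipeline, the pandas error is only a symptom, and the cause would be upstream in whatever produced the empty or malformed extraction results.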

pimooook · 2024-09-01T09:00:27Z

I have the same issue with ollama and qwen2.
I found that the default num_ctx=2048 is too small to produce the right response.
I solved the problem by setting num_ctx=32000 for qwen2. It works for me.
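For reference, a minimal sketch of one way to pass num_ctx to Ollama, using the `options` field of its native `/api/generate` endpoint. The host, model name, and the helper names `build_generate_request` and `send_request` are assumptions for illustration. This is not necessarily how the setting was applied above: when the model is reached through an OpenAI-compatible endpoint, the context size may instead need to be set on the model itself, for example with `PARAMETER num_ctx 32000` in a Modelfile.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "qwen2",
    num_ctx: int = 32000,
    host: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build a non-streaming /api/generate request with a custom context size."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Per-request model options; num_ctx is the context window in tokens.
        "options": {"num_ctx": num_ctx},
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(req: urllib.request.Request) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


req = build_generate_request("Say hello.")
print(req.full_url)
print(json.loads(req.data)["options"])
```

`send_request(req)` is left uncalled here because it needs a local Ollama server with the model already pulled.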

github-actions · 2024-09-09T01:57:25Z

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions · 2024-09-15T02:01:30Z

This issue has been closed after being marked as stale for five days. Please reopen if needed.

dvdtoth mentioned this issue Jul 8, 2024

[Bug] "ValueError: Columns must be same length as key" - Entity extraction fails due to invalid format returned by API #443

Closed

github-actions bot added the stale label Jul 29, 2024

natoverse removed the stale label Jul 30, 2024

natoverse added the awaiting_response label Aug 6, 2024

github-actions bot added the stale label Aug 20, 2024

github-actions bot removed the stale label Aug 23, 2024

github-actions bot added the stale label Sep 9, 2024

github-actions bot added the autoresolved label Sep 15, 2024

github-actions bot closed this as not planned Sep 15, 2024
