Changing Dataset #4

camelot2002 · 2021-08-01T17:52:33Z

I wanted to change the data set but am unable to understand how you have mapped document_ids to the documents. A little clarification of that in readme.md would be really helpful.
Thank you.

maifeng · 2021-08-01T19:16:26Z

The document ids are either unique IDs provided by the data vendor or they can be incremental IDs. If you have a CSV file with no other unique identifiers, you can save the row numbers as the document IDs.
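As a rough illustration of the row-number approach, here is a minimal Python sketch. The column names (`ticker`, `text`), the sample rows, and the helper `row_numbers_as_ids` are all hypothetical and not part of this repository; it only shows one way to pair each CSV row with an incremental ID.

```python
import csv
import io


def row_numbers_as_ids(csv_text, text_column):
    """Pair every CSV row with its 1-based row number as the document ID.

    Hypothetical helper for illustration only; not part of the repository.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    ids, documents = [], []
    for row_number, row in enumerate(reader, start=1):
        ids.append(str(row_number))          # incremental ID = row number
        documents.append(row[text_column])   # the document text for that row
    return ids, documents


# Made-up sample data: one document per CSV row.
sample = (
    "ticker,text\n"
    "AAPL,First call transcript\n"
    "MSFT,Second call transcript\n"
)
ids, documents = row_numbers_as_ids(sample, "text")
print(list(zip(ids, documents)))
```

If the data already carries a unique identifier from the vendor, that value could be used in place of the row number.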

camelot2002 · 2021-08-01T19:41:51Z

I don't have a CSV file; all I have is the data.

camelot2002 · 2021-08-01T19:44:32Z

I have a ticker to differentiate the companies. But in your CSV files one document seems to have multiple document IDs, and I don't understand how a document has been broken down.

maifeng · 2021-08-01T20:12:52Z

One input document corresponds to one unique ID. The number of rows in the document file is the same as in the document-id file.

camelot2002 · 2021-08-01T20:28:50Z

The document.txt in the input folder contains several documents, right? Each line has a unique ID, and each document also has a unique ID. How does it differentiate between the documents in all that text?

maifeng · 2021-08-01T22:37:02Z

Each line in document.txt is a unique document with line breaks removed.
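To make the one-line-per-document layout concrete, here is a minimal Python sketch, assuming the raw data is available as (ID, text) pairs. The helper names, the ticker-based IDs, and the sample texts are hypothetical. Collapsing all whitespace runs into single spaces is a choice made here; the thread only states that line breaks are removed.

```python
def flatten(text):
    """Collapse line breaks (and other whitespace runs) into single spaces."""
    return " ".join(text.split())


def build_input_files(raw_docs):
    """Return (ids_text, documents_text) with one entry per line in each.

    raw_docs is a list of (doc_id, text) pairs. Line i of ids_text is the
    ID of the document on line i of documents_text.
    Hypothetical helper for illustration only; not part of the repository.
    """
    ids = [doc_id for doc_id, _ in raw_docs]
    if len(ids) != len(set(ids)):
        raise ValueError("document IDs must be unique")
    lines = [flatten(text) for _, text in raw_docs]
    return "\n".join(ids) + "\n", "\n".join(lines) + "\n"


# Made-up sample: IDs built from a ticker plus a counter.
raw_docs = [
    ("AAPL_1", "Good morning.\nWelcome to the call.\n"),
    ("MSFT_1", "Thank you all\nfor joining."),
]
ids_text, documents_text = build_input_files(raw_docs)
print(ids_text)
print(documents_text)
```

The two returned strings can then be written to the document-id file and the document file, which keeps both files at the same number of rows.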

camelot2002 · 2021-08-02T04:52:39Z

Okay, thank you.

maifeng closed this as completed Aug 1, 2021
