ERROR: 4 - fails to transform a large JSONL file #53

Closed
reichaves opened this issue Oct 5, 2022 · 3 comments

Comments

@reichaves

Hello,
I have a very large .jsonl file (870 MB) and I am using Python 3.8 on Ubuntu.
But twarc2 csv fails to convert it to CSV:

twarc2 csv results.jsonl tweets_with_attacks_journalists.csv

💔 ERROR: 4 Unexpected items in data!
Are you sure you specified the correct --input-data-type?
If the object type is correct, add extra columns with:
--extra-input-columns "edit_controls.is_edit_eligible,edit_controls.editable_until,edit_history_tweet_ids,edit_controls.edits_remaining"
Skipping entire batch of 9944 tweets!

Is there any other way to convert to CSV?

@SamHames

SamHames commented Oct 5, 2022

Those columns are the very recently added fields related to tweet editing; twarc-csv doesn't support them yet.

Does using the suggested extra flag work? It should look like this:

twarc2 csv results.jsonl tweets_with_attacks_journalists.csv --extra-input-columns "edit_controls.is_edit_eligible,edit_controls.editable_until,edit_history_tweet_ids,edit_controls.edits_remaining"
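
To see which edit-related fields a given file actually contains before building the --extra-input-columns value, the tweets can be scanned for their key paths. The following is a minimal sketch, not part of twarc-csv: the helper names flatten_keys and count_tweet_keys are hypothetical, and it assumes each line of the .jsonl file is one API response page with the tweets under "data".

```python
import json
from collections import Counter


def flatten_keys(obj, prefix=""):
    """Yield dotted key paths for a nested dict, e.g. 'edit_controls.edits_remaining'."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects so only leaf paths are reported.
            yield from flatten_keys(value, path)
        else:
            yield path


def count_tweet_keys(lines):
    """Count dotted key paths across all tweets in an iterable of JSONL lines.

    Assumption: every non-empty line is a JSON object whose "data" entry is
    either a list of tweets or a single tweet object.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        page = json.loads(line)
        data = page.get("data", [])
        tweets = data if isinstance(data, list) else [data]
        for tweet in tweets:
            counts.update(flatten_keys(tweet))
    return counts
```

Passing an open file handle for results.jsonl to count_tweet_keys and keeping the keys that start with "edit_" should give the names to join with commas for the flag. The file is read line by line, so its size should not be a problem.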

@reichaves
Author

Thanks, that way it worked

@igorbrigadir
Collaborator

Fixed in 0.6.0

Please update with:

pip install --upgrade twarc-csv
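
To confirm that the installed version is at least 0.6.0 after upgrading, the package metadata can be checked from Python. This is a rough sketch with hypothetical helper names (version_tuple, has_edit_field_fix, installed_twarc_csv_version); it only compares the leading numeric parts of the version string, so pre-release suffixes are ignored.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.6.0' into (0, 6, 0), keeping only leading numeric parts."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '0.6' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_edit_field_fix(version):
    """True if the version is 0.6.0 or later, the release named in this thread."""
    return version_tuple(version) >= (0, 6, 0)


def installed_twarc_csv_version():
    """Return the installed twarc-csv version string, or None if it is not installed."""
    try:
        return metadata.version("twarc-csv")
    except metadata.PackageNotFoundError:
        return None
```

If installed_twarc_csv_version() returns a string for which has_edit_field_fix is true, the --extra-input-columns workaround should no longer be needed.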
