tweets.py #429
Comments
This one right? https://github.com/DocNow/twarc/blob/main/utils/tweets.py It doesn't matter to it what the actual file format is, it will still read it line by line, so it should work with |
Yes, that one. That's what I thought but I receive this error when I run the utility on jsonl files only: Traceback (most recent call last): |
Ah, if tweets are in a different format than the default, for example - if they're longer than 140 characters, the text field is |
Ah, I see. So how do we account for this without having to manually change the Python code from tweet["text"] to tweet["full_text"]? I'm still fairly new to JSON, but is there a way to use an if statement to check whether tweet["full_text"] is in the JSON and, if not, print tweet["text"]? Or vice versa? print(("[%s] @%s: %s (%s)" % ( |
Unfortunately this is something that might require a code change - but yes, the change would involve a chain of
And use it like this:
Apologies if there's an error here - I didn't test this. Also, this will only work with |
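For anyone landing here later, a minimal sketch of the kind of fallback discussed above. It is not the code from the original comment and not what tweets.py actually does. It assumes v1.1-style tweet objects, one JSON object per line, where the body is in "full_text", nested under "extended_tweet", or in plain "text". The helper names `tweet_text`, `iter_tweets`, and `format_tweet` are made up for illustration.

```python
import json


def tweet_text(tweet):
    """Return the tweet body, preferring the untruncated field when present."""
    # Assumption: extended-mode tweets carry a top-level "full_text".
    if "full_text" in tweet:
        return tweet["full_text"]
    # Assumption: some payloads nest it under "extended_tweet" instead.
    extended = tweet.get("extended_tweet")
    if extended and "full_text" in extended:
        return extended["full_text"]
    # Fall back to the classic field.
    return tweet["text"]


def iter_tweets(lines):
    """Yield one tweet dict per non-empty line of line-oriented JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def format_tweet(tweet):
    """Hypothetical one-line summary: screen name plus text."""
    return "@%s: %s" % (tweet["user"]["screen_name"], tweet_text(tweet))


if __name__ == "__main__":
    # In-memory sample standing in for the lines of a .jsonl file.
    sample = [
        '{"user": {"screen_name": "a"}, "text": "short tweet"}',
        '{"user": {"screen_name": "b"}, "full_text": "a longer tweet"}',
    ]
    for tweet in iter_tweets(sample):
        print(format_tweet(tweet))
```

To run it over a real file, pass an open file handle to `iter_tweets` in place of the sample list. A file that holds a single JSON array rather than one object per line would need `json.load` instead.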
Ah, that's unfortunate that it doesn't work out of the box like that. Can you paste those JSON examples in here? I'd like to check them out later.
Here are the tweet id txt files for the jsonl and json examples respectively. |
going over some old issues - in this case, i would now recommend using twarc2 and twarc-csv:
And to get just id,text,username:
Is there a way to expand tweets.py to work on jsonl as well as json?