Releases: DocNow/twarc-csv
v0.7.2
v0.7.1
Small bug fix to not crash when entities are missing in a tweet.
v0.7.0
This new version changes the CSV format significantly compared with the previous one:
- Added extra fields for usernames (in_reply_to_username, retweeted_username, quoted_username); previously these were IDs only.
- Process entities by default (extract expanded URLs; JSON objects with indexes for hashtags and mentions are now just lists).
- Add new command line option --process-entities to turn this off.
- New Twitter data field for users: verified_type.
- Rearranged the order of output columns in the CSV, which may affect anything that depends on column order.
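As a rough illustration of what "JSON objects with indexes ... are now just lists" means, the sketch below flattens a tweet's entities into plain lists. It assumes the Twitter API v2 entity layout (tag, username, expanded_url keys); the function name flatten_entities is hypothetical and this is not twarc-csv's actual implementation.

```python
def flatten_entities(tweet):
    """Reduce entity objects to plain lists of values (hypothetical helper)."""
    # A tweet may have no "entities" key at all, so fall back to an empty dict.
    entities = tweet.get("entities") or {}
    return {
        "hashtags": [h["tag"] for h in entities.get("hashtags", [])],
        "mentions": [m["username"] for m in entities.get("mentions", [])],
        # Prefer the expanded URL; fall back to the shortened one if it is absent.
        "urls": [u.get("expanded_url") or u.get("url") for u in entities.get("urls", [])],
    }


# Example tweet in the assumed API v2 shape, with start/end indexes on each entity.
example = {
    "text": "Hello @jack #twarc https://t.co/abc",
    "entities": {
        "hashtags": [{"start": 12, "end": 18, "tag": "twarc"}],
        "mentions": [{"start": 6, "end": 11, "username": "jack"}],
        "urls": [{"start": 19, "end": 35, "url": "https://t.co/abc",
                  "expanded_url": "https://example.com/page"}],
    },
}
flat = flatten_entities(example)
print(flat)
```

The indexes are dropped and only the values remain, which is easier to store in a single CSV cell. A tweet without entities simply yields empty lists.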
v0.6.0
Added new columns to support edited tweets, and added the previously missing rule match column for streamed tweets.
v0.5.2
New version adds ability to convert lists, and validates output columns.
v0.5.1
Fix package not including modules and Readme.
v0.5.0
This version significantly changes the output CSV format and defaults.
It is recommended to re-run the export as the columns have changed.
Previously, the default was to output original retweets as is and insert referenced tweets inline. With this version, referenced tweets are not inserted (so by default you will only see the tweets from data, not any referenced replies or quotes), and retweets are now returned in "merged" form, where the tweet text is replaced with the non-truncated original, with matching entities etc.
This version also requires a newer version of twarc. Update everything with:
pip install --upgrade twarc twarc-csv
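The "merged" retweet form described in this release can be pictured with the sketch below. It assumes the Twitter API v2 shape, where a retweet carries a referenced_tweets entry of type "retweeted" and the original tweet is available separately; the function name merge_retweet and the originals_by_id lookup are hypothetical, and the real twarc-csv logic may differ in detail.

```python
import copy


def merge_retweet(tweet, originals_by_id):
    """Return a copy of the tweet; for a retweet, use the original's text and entities.

    originals_by_id is a hypothetical dict mapping tweet IDs to the full
    original tweets (for example, built from the response's included tweets).
    """
    merged = copy.deepcopy(tweet)
    for ref in tweet.get("referenced_tweets", []):
        if ref.get("type") != "retweeted":
            continue  # replies and quotes are left untouched
        original = originals_by_id.get(ref["id"])
        if original is None:
            continue  # original not available; keep the truncated retweet as is
        merged["text"] = original["text"]
        if "entities" in original:
            merged["entities"] = copy.deepcopy(original["entities"])
    return merged


retweet = {
    "id": "2",
    "text": "RT @someone: a long tweet that gets cut off in the retw…",
    "referenced_tweets": [{"type": "retweeted", "id": "1"}],
}
originals = {
    "1": {
        "id": "1",
        "text": "a long tweet that gets cut off in the retweet but not here",
        "entities": {"hashtags": []},
    }
}
merged = merge_retweet(retweet, originals)
print(merged["text"])
```

The retweet keeps its own ID and metadata; only the text and entities are taken from the original, so the entities match the text that is shown.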
v0.3.8
Fix parsing of non-tweet objects from the stream, and add these to the report output.
v0.3.7
Fix ability to use pipes. The check for empty files would throw an error when using STDIN.
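One way to make an empty-file check safe for piped input is sketched below: only regular, seekable files are checked for size, and pipes or STDIN are never treated as empty. The function name is_empty_input is hypothetical and this is not necessarily how twarc-csv implements the fix.

```python
import os
import sys


def is_empty_input(infile):
    """Return True only for a seekable file of zero bytes; never for pipes or STDIN."""
    # Pipes and STDIN have no meaningful size, so skip the check for them.
    if infile is sys.stdin or not infile.seekable():
        return False
    try:
        return os.fstat(infile.fileno()).st_size == 0
    except (OSError, ValueError):
        # In-memory streams and similar objects have no file descriptor.
        return False
```

With a guard like this, a command such as "cat tweets.jsonl | twarc2 csv - output.csv" would not be rejected just because the input has no file size to inspect.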
v0.3.6
- Don't include rows for referenced tweets that are missing.
- Change output report slightly.
- Add missing withheld fields.