Improved output formatting, with option for CSV/JSON #6

Merged
pielco11 merged 7 commits into twintproject:main on Mar 13, 2023

Conversation

archy-bold
Contributor

This PR improves output formatting by letting users choose between -Format csv and -Format json, which makes the output easier to process.

The default "csv" mode now outputs valid CSV, quoting fields where needed, instead of the old comma-joined format, which could break when dates, tweets, or user display names contained commas.
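
For anyone curious how the quoting works, here is a rough sketch of the idea using Go's standard encoding/csv package. It is not the code from this PR: the tweetRow type and the toCSV function are hypothetical names made up for illustration, and the column order is only assumed from the csv example in this thread.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// tweetRow is a hypothetical record type (an assumption for this sketch);
// the field order mirrors the csv example output in this thread.
type tweetRow struct {
	ID, URL, Timestamp, Username, Fullname, Text string
}

// toCSV renders rows as CSV. encoding/csv quotes any field that contains
// a comma, a double quote, or a line break, so such fields stay intact.
func toCSV(rows []tweetRow) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	for _, r := range rows {
		record := []string{r.ID, r.URL, r.Timestamp, r.Username, r.Fullname, r.Text}
		if err := w.Write(record); err != nil {
			return "", err
		}
	}
	// Flush pushes buffered data into buf; Error reports any write failure.
	w.Flush()
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := toCSV([]tweetRow{{
		ID:        "1620535833609764865",
		URL:       "https://twitter.com/WD_CFC/status/1620535833609764865",
		Timestamp: "Jan 31, 2023 · 9:33 PM UTC",
		Username:  "@WD_CFC",
		Fullname:  "West Didsbury & Chorlton",
		Text:      "89. West sub. Eme off and replaced by Westall.",
	}})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Only the timestamp contains a comma here, so it is the only field that ends up wrapped in double quotes.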

The "json" option outputs one JSON object per line, one for each tweet, with the fields {id, url, text, username, fullname, timestamp}.
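
As a minimal sketch of the one-object-per-line idea (again, not the PR's actual code), the snippet below marshals each tweet with Go's encoding/json and appends a newline. The tweet struct and the toJSONLines function are hypothetical; only the JSON field names are taken from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// tweet is a hypothetical struct for this sketch; only the JSON field
// names come from the PR description.
type tweet struct {
	ID        int64  `json:"id"`
	URL       string `json:"url"`
	Text      string `json:"text"`
	Username  string `json:"username"`
	Fullname  string `json:"fullname"`
	Timestamp string `json:"timestamp"`
}

// toJSONLines encodes each tweet as a single JSON object followed by a
// newline, so the output can be consumed one line at a time.
func toJSONLines(tweets []tweet) (string, error) {
	var sb strings.Builder
	for _, t := range tweets {
		b, err := json.Marshal(t)
		if err != nil {
			return "", err
		}
		sb.Write(b)
		sb.WriteByte('\n')
	}
	return sb.String(), nil
}

func main() {
	out, err := toJSONLines([]tweet{{
		ID:        1620535833609764865,
		URL:       "https://twitter.com/WD_CFC/status/1620535833609764865",
		Text:      "89. West sub. Eme off and replaced by Westall.",
		Username:  "@WD_CFC",
		Fullname:  "West Didsbury & Chorlton",
		Timestamp: "Jan 31, 2023 · 9:33 PM UTC",
	}})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Note that json.Marshal escapes & as \u0026 by default, which is why the display name shows up that way in JSON output.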

The IDE also linted the imports and removed some spacing; I've opted to leave those changes in.

This could be further improved by moving the FormatTweets function out of the Scrape function, but I've left that structure as it was.

Reformatted format functionality
Keeping csv default format
Not printing empty lines in csv mode
@archy-bold
Contributor Author

Examples:

$ ./twint-lite -Query "(from:WD_CFC) since:2023-01-31_17:30:00_UTC  until:2023-01-31_21:45:00_UTC" -Instance birdsite.xanny.family -Format json
{"id":1620537242551992320,"url":"https://twitter.com/WD_CFC/status/1620537242551992320","text":"FT: Wythenshawe Town 0-3 West. A sensational second-half display and a big three points heading back up the Airport line!  ⚽ Matthews  Davis  Kirongozi","username":"@WD_CFC","fullname":"West Didsbury \u0026 Chorlton","timestamp":"Jan 31, 2023 · 9:39 PM UTC"}
{"id":1620536293645225984,"url":"https://twitter.com/WD_CFC/status/1620536293645225984","text":"90+1. Another West sub as we enter an unknown amount of added time  as Delaney replaces Swift.","username":"@WD_CFC","fullname":"West Didsbury \u0026 Chorlton","timestamp":"Jan 31, 2023 · 9:35 PM UTC"}
{"id":1620535833609764865,"url":"https://twitter.com/WD_CFC/status/1620535833609764865","text":"89. West sub. Eme off and replaced by Westall.","username":"@WD_CFC","fullname":"West Didsbury \u0026 Chorlton","timestamp":"Jan 31, 2023 · 9:33 PM UTC"}
{"id":1620532753333907456,"url":"https://twitter.com/WD_CFC/status/1620532753333907456","text":"77. Penalty to Wythy. You know what happened next. Still 3-0 to West. 🧤","username":"@WD_CFC","fullname":"West Didsbury \u0026 Chorlton","timestamp":"Jan 31, 2023 · 9:21 PM UTC"}
$ ./twint-lite -Query "(from:WD_CFC) since:2023-01-31_17:30:00_UTC  until:2023-01-31_21:45:00_UTC" -Instance birdsite.xanny.family -Format csv
1620537242551992320,https://twitter.com/WD_CFC/status/1620537242551992320,"Jan 31, 2023 · 9:39 PM UTC",@WD_CFC,West Didsbury & Chorlton,FT: Wythenshawe Town 0-3 West. A sensational second-half display and a big three points heading back up the Airport line!  ⚽ Matthews  Davis  Kirongozi
1620536293645225984,https://twitter.com/WD_CFC/status/1620536293645225984,"Jan 31, 2023 · 9:35 PM UTC",@WD_CFC,West Didsbury & Chorlton,90+1. Another West sub as we enter an unknown amount of added time  as Delaney replaces Swift.
1620535833609764865,https://twitter.com/WD_CFC/status/1620535833609764865,"Jan 31, 2023 · 9:33 PM UTC",@WD_CFC,West Didsbury & Chorlton,89. West sub. Eme off and replaced by Westall.
1620532753333907456,https://twitter.com/WD_CFC/status/1620532753333907456,"Jan 31, 2023 · 9:21 PM UTC",@WD_CFC,West Didsbury & Chorlton,77. Penalty to Wythy. You know what happened next. Still 3-0 to West. 🧤

@archy-bold
Contributor Author

Further bug fixes:

  1. Commas, semicolons, newlines, etc. are no longer stripped from tweet text, as the formatters now handle them
  2. Tweet IDs are now strings, since large numeric IDs can cause floating-point precision issues
  3. The program now exits if no tweets are found
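
To illustrate the second point, here is a small sketch (not code from this PR) showing that a tweet ID of this size does not survive a round trip through a 64-bit float, which is what can happen in any consumer that parses JSON numbers as doubles. The survivesFloat64 and encodeID helpers and the idOnly struct are hypothetical names used only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// idOnly is a hypothetical struct that keeps the tweet ID as a string.
type idOnly struct {
	ID string `json:"id"`
}

// survivesFloat64 reports whether id is unchanged after being converted
// to float64 and back. float64 has a 53-bit significand, so integers
// above 2^53 are not all exactly representable.
func survivesFloat64(id int64) bool {
	return int64(float64(id)) == id
}

// encodeID marshals the ID as a JSON string, so consumers never parse it
// as a number and cannot lose digits.
func encodeID(id string) (string, error) {
	b, err := json.Marshal(idOnly{ID: id})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// A tweet ID taken from the examples in this thread.
	fmt.Println(survivesFloat64(1620535833609764865)) // false: precision is lost

	out, err := encodeID("1620535833609764865")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"id":"1620535833609764865"}
}
```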

@HackerDaGreat57

This got famous over on r/ProgrammerHumor

@wise-introvert

> This got famous over on r/ProgrammerHumor

Hello fellow traveller!

@LEGENDARY-KING

@afkvido

afkvido commented Mar 11, 2023

> This got famous over on r/ProgrammerHumor

hell yeah it did

@xybrs

xybrs commented Mar 11, 2023

LGTM. Greetings from programmer humor

@PROxZIMA

$42,000 for a single API? Hell nahh

@pielco11
Member

pielco11 commented Mar 13, 2023

Hello everyone,

sorry for missing this.

Nice work btw


@pielco11 merged commit d642400 into twintproject:main on Mar 13, 2023

9 participants