remove archive file naming logic #51

edsu · 2015-01-28T11:11:19Z

I think it would simplify the code quite a bit if twarc simply wrote tweets to stdout and let the user decide which file they should go to.

When run repeatedly, twarc tries to determine the since_id to use when talking to the Twitter API based on data that has already been archived. But this functionality depends on twarc being run in the same directory as the other archive files, and on the filenames matching a particular pattern (which can get ugly). Determining the since_id also doesn't work properly with files created with --stream, since they are ordered differently.

I propose that this logic be removed and that we add a --min_id option to match --max_id. The user can then control what they want to do and where the data goes.
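As a rough sketch of what the user's side could look like: assuming the archived output is one JSON tweet per line with a numeric `id` field, something like the following would find the highest id already archived. The `max_tweet_id` helper is hypothetical and not part of twarc. It scans every line rather than trusting the first or last one, so the ordering of the file doesn't matter.

```python
import json


def max_tweet_id(lines):
    """Return the largest tweet id in an iterable of JSON lines, or None if empty.

    Hypothetical helper, not part of twarc. Assumes one JSON object per line,
    each with an "id" field.
    """
    largest = None
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines rather than failing on them.
            continue
        tweet_id = int(json.loads(line)["id"])
        if largest is None or tweet_id > largest:
            largest = tweet_id
    return largest
```

The result could then be passed to the proposed --min_id option on the next run.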

edsu added the enhancement label Jan 28, 2015

edsu added a commit that referenced this issue Jan 30, 2015

upping version for release ; fixes #48 #51 #39

271570c

edsu closed this as completed Jan 30, 2015
