Can't track hashtags with '#' in the 'filter' query #359

Closed
glocalglocal opened this issue Oct 25, 2020 · 4 comments

Comments

@glocalglocal

I am trying to track two hashtags, the equivalent of "#string1 OR #string2".

This works:
twarc filter string1,string2 > filename

But I can't find a working syntax that includes the '#', either by experimentation or in Twitter's documentation. I tried commas, 'OR', and single and double quotes. Even twarc filter #string1 > filename fails to capture any tweets. Is there a way to specify hashtags at all?

@edsu
Member

edsu commented Oct 25, 2020

Yes, this is the shell getting in the way: # starts a comment in most Unix shells, so everything after it is ignored. Try quoting your filter string like this:

twarc filter '#string1,#string2' > filename

I believe the docs for filter state that filtering on string1 will include tweets with #string1 in them. But who knows what really goes on behind the scenes...
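
To illustrate the comment behaviour described above, here is a minimal sketch that uses Python's standard shlex module to approximate how a POSIX shell splits the two command lines. The helper name shell_words is made up for this example. shlex is only an approximation of a real shell: it does not perform redirection, so '>' shows up as an ordinary word.

```python
import shlex


def shell_words(command):
    """Split a command line roughly as a POSIX shell would, honoring # comments."""
    return shlex.split(command, comments=True)


# Unquoted: '#' starts a comment, so the hashtag and the redirect both vanish.
print(shell_words("twarc filter #string1 > filename"))
# -> ['twarc', 'filter']

# Quoted: '#' is ordinary text inside a single argument.
print(shell_words("twarc filter '#string1,#string2' > filename"))
# -> ['twarc', 'filter', '#string1,#string2', '>', 'filename']
```

In the unquoted case twarc would receive no track terms at all, and no output file would be written.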

@glocalglocal
Author

'#string1,#string2' doesn't work. I am not sure if the '#' can be escaped in a shell command.

Yes, string1 will include #string1 apparently. If there is no other way, I may have to collect promiscuously and later filter out the tweets without #string1. A Python utility might be an idea, but I am not sure I can do that just yet.

@edsu
Member

edsu commented Oct 25, 2020

Yes, filtering on hashtags should be pretty easy after the fact. If you need help, let us know in the docnow slack: https://bit.ly/docnow-slack
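
A rough sketch of what such after-the-fact filtering could look like in Python. It assumes the collected file holds one tweet per line in the Twitter v1.1 JSON layout, with hashtags under entities.hashtags (or under extended_tweet.entities for longer streamed tweets). The function names tweet_hashtags and filter_by_hashtag are made up for this example and are not part of twarc.

```python
import json


def tweet_hashtags(tweet):
    """Return the lowercased hashtags of one tweet dict (assumed v1.1 layout)."""
    # Longer streamed tweets keep their full entities under "extended_tweet";
    # otherwise fall back to the top-level "entities".
    entities = tweet.get("extended_tweet", {}).get("entities") or tweet.get("entities", {})
    return {tag["text"].lower() for tag in entities.get("hashtags", [])}


def filter_by_hashtag(lines, wanted):
    """Yield the JSON lines whose tweet carries at least one wanted hashtag."""
    wanted = {w.lstrip("#").lower() for w in wanted}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the collected file
        if tweet_hashtags(json.loads(line)) & wanted:
            yield line
```

For example, passing an open file handle for the collected tweets together with ["#string1", "#string2"] yields only the lines whose tweets actually carry one of those hashtags, which can then be written to a new file. Matching is case-insensitive, and hashtags inside retweeted or quoted tweets are not inspected in this sketch.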

@igorbrigadir
Contributor

Closing some old issues - this depends on where you run twarc, but this works:

twarc filter "#cats,#dogs" > hashtags.jsonl

How to escape # on the command line depends on the OS and the shell itself.
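
For POSIX-style shells specifically, Python's standard shlex.join can produce a safely quoted command line. This is a small sketch with a made-up helper name, quoted_filter_command. It says nothing about Windows cmd or PowerShell, whose quoting rules differ.

```python
import shlex


def quoted_filter_command(track):
    """Build a 'twarc filter' command line with the track string quoted for a POSIX shell."""
    return shlex.join(["twarc", "filter", track])


print(quoted_filter_command("#cats,#dogs"))
# -> twarc filter '#cats,#dogs'
```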
