Added universal search function which passes testing #148

Open
wants to merge 8 commits into
base: master
from
57 changes: 42 additions & 15 deletions README.md
@@ -23,7 +23,7 @@ Linux and macOS:
```bash
git clone https://github.com/bisguzar/twitter-scraper.git
cd twitter-scraper
sudo python3 setup.py install
```

Also, you can install with PyPI.
@@ -37,23 +37,51 @@ pip3 install twitter_scraper
Just import **twitter_scraper** and call functions!


### → function **get_tweets(query: str [, pages: int])** -> dictionary
You can get tweets of profile or parse tweets from hashtag, **get_tweets** takes username or hashtag on first parameter as string and how much pages you want to scan on second parameter as integer.
### → function **get_tweets(query: str, search: str [, pages: int])** -> dictionary
You can get the tweets of a profile or parse the tweets for a hashtag. **get_tweets** takes a username or hashtag as the `query` string and the number of pages to scan as the integer `pages`.

#### Keep in mind:
* First parameter need to start with #, number sign, if you want to get tweets from hashtag.
* **pages** parameter is optional.
The *get_tweets* function now supports a `search` parameter for the new search functionality.

For backwards compatibility with existing twitter_scraper users, `query` can be passed either by keyword (`query=`) or as the first positional string. With `query`, you can get the tweets of a given Twitter user or parse the tweets for a given hashtag.

Example:

```python
Python 3.7.3 (default, Mar 26 2019, 21:43:19)
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from twitter_scraper import get_tweets
>>>
>>> for tweet in get_tweets('twitter', pages=1):
... print(tweet['text'])
...
spooky vibe check
...
>>> # The keyword form below behaves identically:
>>> from twitter_scraper import get_tweets
>>>
>>> for tweet in get_tweets(query='twitter', pages=1):
... print(tweet['text'])
...
```

If `search` is specified, **get_tweets** yields a dictionary for each tweet that matches the given term. The term can be any string and may use Twitter's search keywords, such as `to:`.
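
As a rough sketch of what happens to a `search` term: the snippet below percent-encodes it and places it in the `q=` parameter of the search timeline URL that appears in this pull request's change to `tweets.py`. The helper name `build_search_url` is hypothetical and not part of twitter_scraper, and the real function appends further query parameters before making the request.

```python
from urllib.parse import quote

# URL template copied from the tweets.py change in this pull request.
SEARCH_URL = (
    "https://twitter.com/i/search/timeline?f=tweets&vertical=default"
    "&q={term}&src=tyah&reset_error_state=false&"
)


def build_search_url(search: str) -> str:
    """Hypothetical helper: percent-encode the term and insert it into the URL."""
    return SEARCH_URL.format(term=quote(search))


# Characters such as ':' and spaces are escaped so they survive in the query string.
print(build_search_url("to:bugraisguzar"))
```

Encoding the term first means search operators and punctuation cannot be misread as part of the URL structure.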


#### Keep in mind:
* You must specify either `query` or `search`. If you supply a single positional string, it is used as `query`.
* You cannot pass more than one search string, and you cannot specify both `query` and `search`; doing so raises a `RuntimeError`.
* The **pages** parameter is optional; the default is 25.
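
The rules above can be sketched as a small validation step. This mirrors the checks added to `get_tweets` in this pull request, but `resolve_term` is a hypothetical stand-in used only for illustration; it never contacts Twitter.

```python
def resolve_term(query=None, search=None):
    """Hypothetical stand-in for the argument checks in get_tweets."""
    if not query and not search:
        raise RuntimeError("Please specify a 'query' or a 'search'.")
    if query and search:
        raise RuntimeError("Please specify only one of 'search' or 'query'.")
    # A single positional string binds to `query`, matching the old API.
    return ("query", query) if query else ("search", search)


print(resolve_term("twitter"))            # positional string is treated as query
print(resolve_term(search="to:someone"))  # keyword selects the search path
```

Note that passing two positional strings binds the second one to `search`, so it is rejected by the same "only one" check.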

```python
Python 3.7.3 (default, Mar 26 2019, 21:43:19)
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from twitter_scraper import get_tweets
>>>
>>> for tweet in get_tweets(search='to:bugraisguzar', pages=1):
... print(tweet['text'])
...
pic.twitter.com/h24Q6kWyX8
```

@@ -78,7 +106,7 @@ It returns a dictionary for each tweet. Keys of the dictionary;
You can get the Trends of your area simply by calling `get_trends()`. It will return a list of strings.

```python
Python 3.7.3 (default, Mar 26 2019, 21:43:19)
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from twitter_scraper import get_trends
@@ -91,7 +119,7 @@ You can get personal information of a profile, like birthday and biography if ex


```python
Python 3.7.3 (default, Mar 26 2019, 21:43:19)
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from twitter_scraper import Profile
@@ -109,7 +137,7 @@ Type "help", "copyright", "credits" or "license" for more information.
**to_dict** is a method of the *Profile* class. It returns the profile data as a Python dictionary.

```python
Python 3.7.3 (default, Mar 26 2019, 21:43:19)
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from twitter_scraper import Profile
@@ -118,8 +146,6 @@ Type "help", "copyright", "credits" or "license" for more information.
{'name': 'Buğra İşgüzar', 'username': 'bugraisguzar', 'birthday': None, 'biography': 'geliştirici@peptr', 'website': 'bisguzar.com', 'profile_photo': 'https://pbs.twimg.com/profile_images/1199305322474745861/nByxOcDZ_400x400.jpg', 'banner_photo': 'https://pbs.twimg.com/profile_banners/1019138658/1555346657/1500x500', 'likes_count': 2512, 'tweets_count': 756, 'followers_count': 483, 'following_count': 255, 'is_verified': False, 'is_private': False, 'user_id': '1019138658'}
```



## Contributing to twitter-scraper
To contribute to twitter-scraper, follow these steps:

@@ -139,6 +165,7 @@ Thanks to the following people who have contributed to this project:
* @bisguzar (maintainer)
* @lionking6792
* @ozanbayram
* @sean-bailey
* @xeliot


11 changes: 11 additions & 0 deletions test.py
@@ -40,6 +40,17 @@ def test_languages(self):
self.assertIsInstance(tweets[0]["replies"], int)
self.assertGreaterEqual(tweets[1]["retweets"], 0)

class TestSearch(unittest.TestCase):
    # Method names must start with "test" for unittest to collect them.
    def test_search_2pages(self):
        tweets = list(get_tweets(search="hello, world!", pages=2))
        self.assertGreater(len(tweets), 1)

    def test_search_user(self):
        user = "gvanrossum"
        tweets = list(get_tweets(user, pages=2))
        self.assertGreater(len(tweets), 1)
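
A note on test discovery, as a minimal sketch: `unittest`'s default loader only collects methods whose names start with `test`, so a method such as `search_user` without that prefix is silently skipped. The `Demo` class below is purely illustrative and unrelated to twitter_scraper.

```python
import unittest


class Demo(unittest.TestCase):
    def search_user(self):        # no "test" prefix: never collected
        self.fail("this body is never executed by the loader")

    def test_search_user(self):   # collected and run
        self.assertTrue(True)


# Ask the default loader which method names it would run for this class.
names = unittest.TestLoader().getTestCaseNames(Demo)
print(names)
```

Checking the collected names this way is a quick guard against tests that appear to pass only because they never ran.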




class TestTrends(unittest.TestCase):
def test_returned(self):
17 changes: 11 additions & 6 deletions twitter_scraper/modules/tweets.py
@@ -6,13 +6,22 @@

session = HTMLSession()

def get_tweets(query, pages=25):
def get_tweets(query=None, search=None, pages=25):
"""Gets tweets for a given user, hashtag, or search term via the Twitter frontend API."""

if not query and not search:
raise RuntimeError("Please specify a 'query' or a 'search' to check the tweets on.")
elif query and search:
raise RuntimeError("Please specify only one of either a 'search' or 'query'.")

after_part = (
f"include_available_features=1&include_entities=1&include_new_items_bar=true"
)
if query.startswith("#"):
if not query:  # no query was given, so this is a search
search_term = quote(search)
url = f"https://twitter.com/i/search/timeline?f=tweets&vertical=default&q={search_term}&src=tyah&reset_error_state=false&"

elif query.startswith("#"):
query = quote(query)
url = f"https://twitter.com/i/search/timeline?f=tweets&vertical=default&q={query}&src=tyah&reset_error_state=false&"
else:
@@ -59,13 +68,9 @@ def gen_tweets(pages):


tweet_id = tweet.attrs["data-item-id"]

tweet_url = profile.attrs["data-permalink-path"]

username = profile.attrs["data-screen-name"]

user_id = profile.attrs["data-user-id"]

is_pinned = bool(tweet.find("div.pinned"))

time = datetime.fromtimestamp(