SocialCrawler

SocialCrawler is a Python package that helps collect data from Twitter, Foursquare, and Swarm.

This package was created to facilitate data mining from Twitter and Foursquare. (Linux only)

Install (generic way)

	$ python3 -m pip install SocialCrawler

How does it work?

Requirements

  • Python >= 3
  • setuptools
  • Foursquare developer credentials (if you want to work with Foursquare)
  • Twitter developer credentials (if you want to work with Twitter)
  • geckodriver installed and in $PATH (we ran into problems with this when trying to run on Linux Mint and Kali); a quick sanity check is sketched after this list
    $ export PATH=$PATH:<geckodriver-path>
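
To confirm that geckodriver is actually visible on $PATH, you can let selenium start Firefox without an explicit driver path. This is only a sanity-check sketch assuming a selenium 3.x-style API; it is not part of SocialCrawler itself:

	# Sanity check: selenium must find geckodriver on $PATH to start Firefox.
	from selenium import webdriver

	options = webdriver.FirefoxOptions()
	options.add_argument("-headless")            # no visible browser window
	driver = webdriver.Firefox(options=options)  # fails here if geckodriver is missing
	driver.get("https://foursquare.com")
	print(driver.title)
	driver.quit()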

Features

  • Since the package uses tweepy as the framework to connect to Twitter, it can use the Twitter Streaming API. Therefore you can filter the stream by:
    • delimited
    • stall_warnings
    • filter_level
    • language
    • follow
    • track
    • locations
    • count
    • with
    • replies
    • stringify_friend_ids

As shown in the Stream Overview.
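
For illustration, here is a minimal sketch of filtering the stream with tweepy directly, using the pre-4.0 StreamListener API; the credential strings are placeholders and CheckinListener is a hypothetical name, not part of SocialCrawler:

	# Minimal tweepy (< 4.0) streaming sketch; credentials are placeholders.
	import tweepy

	class CheckinListener(tweepy.StreamListener):  # hypothetical listener
	    def on_status(self, status):
	        # Called for every live tweet matching the filter below.
	        print(status.text)

	    def on_error(self, status_code):
	        # Return False on HTTP 420 (rate limited) to stop reconnecting.
	        return status_code != 420

	auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
	auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

	stream = tweepy.Stream(auth=auth, listener=CheckinListener())
	# track, locations, languages, etc. map to the parameters listed above.
	stream.filter(track=["swarmapp"], languages=["en"], stall_warnings=True)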

  • Get check-ins shared on Twitter, or the check-ins from the last week.
    • If you have Foursquare credentials you can also track data from specific locations and more; see the sketch below.
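
As an example of what Foursquare credentials enable, here is a sketch that searches venues near a point via the Foursquare v2 venues/search endpoint with requests; the credentials and coordinates are placeholders, this is not SocialCrawler's own API, and the v2 endpoint details may have changed since this README was written:

	# Sketch: search venues near a point via the Foursquare v2 API.
	import requests

	params = {
	    "client_id": "CLIENT_ID",          # placeholder developer credentials
	    "client_secret": "CLIENT_SECRET",
	    "v": "20180101",                   # API version date, required
	    "ll": "40.7484,-73.9857",          # latitude,longitude to track
	    "limit": 10,
	}
	resp = requests.get("https://api.foursquare.com/v2/venues/search", params=params)
	resp.raise_for_status()
	for venue in resp.json()["response"]["venues"]:
	    print(venue["id"], venue["name"])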

See the Wiki!

Changelog

  • v 0.1.0

    • fixed module class declaration
  • v 0.0.9

    • fixed a syntax error and the hacking method dir output
  • v 0.0.8

    • added selenium as a requirement, used for Foursquare browser requests (to avoid rate limits); may not work
    • updated ExtractorData to a full version (NewExtractorData) that gets (almost) full VENUE info
    • removed urllib2 from requirements
    • updated the run flow: there is now always a return value; just check whether a field is NULL, which means the data is missing
  • v 0.0.7

    • when a VENUE or FOURSQUARE GET request errors, the program thread waits 15 minutes before requesting again
    • added new exception handling
    • separated the Foursquare request and the venue request into two try-except blocks
    • fixed a bug writing categorie_id: a missing int-to-str conversion
    • ExtractorData still offers the possibility of using another file (not one created by Collector or CollectorV2) to consult Foursquare (not available yet)
  • v 0.0.6

    • Formatted to PEP 257 and PEP 8 (almost)
    • Implemented ExtractorData: a simple way to get data from Foursquare using the Swarm URL code
    • Added HistoricalCollector.CollectorV2, which gets all data from the tweet JSON and saves it as a TSV file
    • Added to ExtractorData the possibility of using another file (not one created by Collector or CollectorV2) to consult Foursquare (not available yet)
    • added urllib2 to requirements
  • v 0.0.5

    • Fixed a bug in the getStoredData function that allowed some parameters to be None
    • Updated the format of the generated file name
    • Increased the request wait time from 15 minutes to 16. (Sometimes when a request was retried after 15 minutes, the server responded that the 15 minutes had not yet finished.)
    • Updated the saved fields. Now all fields are saved to a file in tab-separated format, as shown in the Wiki.
