
Using Spark & Elasticsearch to process streaming data from the Twitter API


MarwanMashra/Twitter-Streaming-Data


Twitter Streaming Data

Files

How to run

  1. Start by launching the server from stream.ipynb. The default port is 4040, but you can change it. You can also change the Bearer_token if you want to. Run the notebook and make sure your server is listening on the port you chose (it prints a message when it is).
  2. Launch an Elasticsearch server (in a Docker container or not). We assume you'll use port 9200, but you can choose another one.
  3. Put the file process.ipynb in an environment where Spark is installed. We used a Docker container, but you can use whatever you want. At the top of the notebook, set the ports (STREAM_PORT, ELASTIC_PORT) to the ones you chose in the previous steps. There is no need to change them if you kept the defaults.
  4. In the same place, change the host to localhost if you're not running the code in a Docker container. If you are, keep it at host.docker.internal.
  5. You can now run the code in process.ipynb.
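
Before running process.ipynb, it can help to confirm that both servers are reachable with the host and ports you configured. The snippet below is a minimal sketch and is not part of the repository: `resolve_host` and `is_listening` are hypothetical helper names, and the only values taken from the steps above are the default ports (4040 and 9200) and the two host names (localhost and host.docker.internal).

```python
import socket

# Defaults from the steps above; change them if you chose other ports.
STREAM_PORT = 4040
ELASTIC_PORT = 9200


def resolve_host(in_docker: bool) -> str:
    """Hypothetical helper: pick the host name described in step 4."""
    return "host.docker.internal" if in_docker else "localhost"


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Hypothetical helper: True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # Set in_docker=True when this runs inside a Docker container.
    host = resolve_host(in_docker=False)
    for name, port in (("stream", STREAM_PORT), ("elasticsearch", ELASTIC_PORT)):
        status = "listening" if is_listening(host, port) else "not reachable"
        print(f"{name} server at {host}:{port} is {status}")
```

If either server is reported as not reachable, recheck the port values and the host setting before starting the Spark job. A successful TCP connection only shows that something is listening on the port, not that it is the expected service.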
