This is a very basic tutorial on scraping data from Reddit. Although it is written in Python, it does not require knowing the language very well. The tutorial was created for people who will run the interactive notebooks in Google Colab, but it can also be used in Jupyter Notebook. However, the latter requires being a more advanced user who knows how to install packages on their local machine.
Users who want to use the tutorial online in Google Colab should follow these steps to access the interactive notebooks:
- Go to www.colab.research.google.com (having a Google Account is recommended but not necessary).
- Press GitHub in the popup window or press File and Open notebook.
- Type MikoBie in the search box (compare the picture below).
- Pick the relevant repository: reddit
- Choose the relevant notebook and click Open Notebook.
That is it; an interactive notebook should open.
For more advanced users, I recommend running this tutorial on their local machines. In the long run, this will allow scraping more data: even though downloading a lot of data through Jupyter Notebook is a bad idea, at least the environment for more advanced queries will already be set up.
- Python 3.9 (Anaconda distribution is preferred)
- other Python dependencies are specified in requirenments.txt
- Clone the repo: git@github.com:MikoBie/wgi.git
- Set up the proper virtual environment with Python 3.9
- Install all the dependencies from requirenments.txt
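
Once the steps above are done, a quick sanity check can confirm that the local environment matches what the tutorial expects. The snippet below is only a minimal sketch: the helper names `python_version_ok` and `requirements_present` are hypothetical and not part of the repository, and the check assumes that "python3.9" means exactly major version 3 and minor version 9.

```python
import sys
from pathlib import Path

# Assumption: the tutorial targets Python 3.9, as listed in the prerequisites.
REQUIRED = (3, 9)
# File name spelled exactly as in the instructions above.
REQUIREMENTS_FILE = "requirenments.txt"


def python_version_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True if the major.minor version equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


def requirements_present(directory="."):
    """Return True if the requirements file exists in `directory`."""
    return (Path(directory) / REQUIREMENTS_FILE).is_file()


if __name__ == "__main__":
    # Run this from the root of the cloned repository,
    # with the virtual environment activated.
    print("Python 3.9 interpreter:", python_version_ok())
    print("Requirements file found:", requirements_present())
```

If the first line prints `False`, the active interpreter is probably not the one from the virtual environment. If the second prints `False`, the script was most likely started outside the repository folder.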