Stats4R shows you the vital statistics of Reddit's most popular online dating sub.
It is a web scraper and data analysis program written specifically for /r/r4r.
- Gender composition
- Average age of males VS females
- Who's seeking who
- Average comments on posts by males VS females
- Average upvotes to posts by males VS females
- Dataset containing hundreds of r4r posts
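As a rough illustration of how statistics like these could be derived, here is a minimal sketch that computes average age per gender from post titles. It assumes titles follow the common /r/r4r convention of an age followed by a tag such as `[M4F]`. The names `parse_title` and `average_age_by_gender` and the sample titles are made up for this example; this is not necessarily how Stats4R's own scripts work.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed title convention: "<age> [<poster>4<target>] ...", e.g. "24 [F4M] ..."
TITLE_RE = re.compile(r"^\s*(\d{2})\s*\[([A-Za-z])4([A-Za-z])\]")


def parse_title(title):
    """Return (age, poster, target), or None if the title does not match."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).upper(), match.group(3).upper()


def average_age_by_gender(titles):
    """Group ages by the poster's gender letter and average each group."""
    ages = defaultdict(list)
    for title in titles:
        parsed = parse_title(title)
        if parsed is not None:
            age, poster, _target = parsed
            ages[poster].append(age)
    return {gender: mean(values) for gender, values in ages.items()}


# Made-up sample titles, for illustration only
sample_titles = [
    "24 [F4M] #Chicago - Looking for a hiking partner",
    "30 [M4F] Online - Let's trade book recommendations",
    "26 [M4F] #London - Coffee this weekend?",
    "Weekly meta thread",  # no age/tag, so it is skipped
]

print(average_age_by_gender(sample_titles))
```

Titles that do not match the pattern are skipped rather than guessed at, so the averages only reflect posts with a parsable age and tag.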
Screenshot of features in action
Stats4R is built with Python 3. You can check your version of Python by entering the following command in a terminal window:
python --version
If your system does not have Python or has Python 2, please download and install the latest version of Python 3.
The following Python libraries are required to run Stats4R:
pip install beautifulsoup4
pip install requests
pip install matplotlib
pip install Flask
If you use the Anaconda distribution of Python, these libraries should already be included, so you can skip this step.
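If you are unsure whether your setup is ready, a small check like the one below can help. It is only a sketch: the helper names `python_is_supported` and `missing_packages` are hypothetical and not part of Stats4R. It relies on the fact that beautifulsoup4 is imported as `bs4` and Flask as `flask`.

```python
import sys
from importlib.util import find_spec

# pip package name -> module name used when importing it
REQUIRED = {
    "beautifulsoup4": "bs4",
    "requests": "requests",
    "matplotlib": "matplotlib",
    "Flask": "flask",
}


def python_is_supported(version_info=sys.version_info):
    """Stats4R needs Python 3, so only the major version is checked."""
    return version_info[0] >= 3


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]


if not python_is_supported():
    print("Python 3 is required, found:", sys.version.split()[0])

missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages were found.")
```

`find_spec` only looks the module up without importing it, so the check stays fast even for a large library such as matplotlib.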
Running Stats4R is simple.
- Download the repository and unzip it to your desired location
- Navigate to the unzipped folder and open a terminal window
- Enter the command below:
./stats4r.sh
- Sit back and wait. Depending on your internet connection, Stats4R should take approximately 20 minutes to crawl 25 pages of /r/r4r and analyze about 600 posts.
- Open your browser and go to http://localhost:5000/ or http://127.0.0.1:5000/
To quit, press ctrl + c in the terminal window where the program is running.
Windows users can run Stats4R by running the Python scripts one after another, in this order:
python spider_r4r.py
python analyse_r4r.py
python plot_r4r.py
python front_r4r.py
A .bat file will be created soon to run the program with a single command.
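Until then, a small cross-platform Python runner can do the same job. The sketch below is not part of the repository: the file name `run_stats4r.py` and the helpers `build_commands` and `run_all` are hypothetical. It assumes it is saved in the Stats4R folder next to the four scripts and simply runs them in the order listed above.

```python
# run_stats4r.py (hypothetical file name, not shipped with Stats4R)
import subprocess
import sys
from pathlib import Path

# The four scripts, in the order they must run
SCRIPTS = ["spider_r4r.py", "analyse_r4r.py", "plot_r4r.py", "front_r4r.py"]


def build_commands(python=sys.executable, scripts=SCRIPTS):
    """Build one command per script, using the current Python interpreter."""
    return [[python, script] for script in scripts]


def run_all(commands):
    """Run each command in turn; stop at the first one that fails."""
    for command in commands:
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    not_found = [script for script in SCRIPTS if not Path(script).is_file()]
    if not_found:
        print("Run this from the Stats4R folder; not found:", ", ".join(not_found))
    else:
        run_all(build_commands())
```

Run it with `python run_stats4r.py`. Using `sys.executable` keeps every step on the same interpreter, and `check=True` stops the sequence if an earlier step fails, so later scripts do not run on missing data.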
Please feel free to make pull requests or fork the repo.
For feature requests or suggestions, send a message to sharmaeshaanw@gmail.com -- critical feedback is welcome!
Star the repo if you liked the project ⭐
If you're interested in doing something similar to Stats4R, I'd love to talk about it! 🙌