Social Media Scraper

Introduction

This project consists of Python scripts that use the Selenium WebDriver and the PRAW (Python Reddit API Wrapper) library to scrape a specified number of tweets containing a given keyword from Twitter and a specified number of posts from a subreddit on Reddit. Scraped tweets are saved to an Excel file and scraped Reddit posts to a CSV file, so the data can be opened and analyzed easily.

Dependencies

To use this program, you need Python and the following packages installed on your system:

Selenium
pandas
Python Reddit API Wrapper (PRAW)

The Twitter scraper also needs Firefox and geckodriver.

You can install the Python packages with pip by running the following commands:

pip install selenium
pip install pandas
pip install praw
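
After installing, a quick check can confirm that the three packages are importable. The snippet below is a minimal sketch and is not part of the repository; the helper name `missing_dependencies` is hypothetical.

```python
import importlib.util

# Import names of the packages installed with pip above.
REQUIRED = ["selenium", "pandas", "praw"]


def missing_dependencies(names=REQUIRED):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```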

How to Use

Twitter scraper (Twitter_Scraper.py):

Clone this repository or download the script to your local machine.
Open the script in a Python IDE or text editor.
Set the paths to your Firefox binary and geckodriver executable on lines 17 and 18 of the script.
Run the script and enter the following inputs when prompted:
- Your Twitter username
- Your Twitter password
- The number of tweets you want to scrape
- The keyword you want to search for
- The name of the Excel file to be stored
The script then scrapes the tweets and stores them in an Excel file with the specified name.
The Excel file opens automatically once the script finishes.
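
For orientation, the Firefox and geckodriver paths mentioned in the steps above are typically handed to Selenium along the following lines. This is a sketch using the Selenium 4 style API, not the script's actual code: the two paths are placeholders, and the names `missing_paths` and `make_driver` are hypothetical.

```python
import os

# Placeholder paths (assumptions): replace with the locations on your machine.
FIREFOX_BINARY = "/usr/bin/firefox"
GECKODRIVER_PATH = "/usr/local/bin/geckodriver"


def missing_paths(*paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


def make_driver(firefox_binary=FIREFOX_BINARY, geckodriver_path=GECKODRIVER_PATH):
    """Start Firefox through geckodriver and return the WebDriver instance."""
    # Fail early with a readable message if either path is wrong.
    missing = missing_paths(firefox_binary, geckodriver_path)
    if missing:
        raise FileNotFoundError("Not found: " + ", ".join(missing))

    # Imported here so the path check works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.binary_location = firefox_binary
    service = Service(executable_path=geckodriver_path)
    return webdriver.Firefox(service=service, options=options)
```

Checking the paths before launching the browser turns a misconfigured path into a clear error rather than a driver start-up failure.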

Reddit scraper (Reddit_Scraper.py):

Clone this repository or download the script to your local machine.
Create a Reddit "script" app and note its client ID and client secret.
Open the script in a Python IDE or text editor.
Run the script and enter the following inputs when prompted:
- The subreddit name you want to scrape
- The number of posts you want to scrape
- The name of the CSV file to be stored
The script then scrapes the posts and stores them in a CSV file with the specified name.
The CSV file opens automatically once the script finishes.
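
The Reddit steps above can be sketched with PRAW roughly as follows. This is an illustration under assumptions, not the repository's code: the "hot" listing, the selected fields, the use of the standard `csv` module, and the function names are all choices made for this example.

```python
import csv

# Submission attributes kept in the CSV (an assumption; the script may use others).
FIELDS = ["title", "score", "num_comments", "url"]


def fetch_posts(subreddit_name, limit, client_id, client_secret, user_agent):
    """Return an iterator over up to `limit` submissions from the "hot" listing."""
    import praw  # imported here so the helpers below work without PRAW installed

    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )
    return reddit.subreddit(subreddit_name).hot(limit=limit)


def posts_to_rows(posts):
    """Convert submission-like objects into plain dictionaries."""
    return [{field: getattr(post, field) for field in FIELDS} for post in posts]


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as `write_csv(posts_to_rows(fetch_posts("python", 10, client_id, client_secret, "my-scraper")), "posts.csv")` would tie the pieces together, where the client ID and secret come from your Reddit app and the user agent string is a placeholder.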

Functionality and Features

The program scrapes a specified number of tweets containing a given keyword from Twitter and a specified number of posts from a subreddit on Reddit.
Scraped tweets are stored in an Excel file and scraped Reddit posts in a CSV file.
The Twitter scraper uses the Selenium WebDriver to automate logging in to Twitter (if required) and searching for tweets.
When scraping Twitter, the program prompts for your Twitter login credentials (if required), the number of tweets to scrape, and the keyword to search for.
When scraping Reddit, the program prompts for the subreddit to scrape and the number of posts to scrape.
In both cases, you specify the name of the output file.
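
As a rough illustration of the Twitter prompts and the Excel output, the inputs could be gathered and the results saved as shown below. This is a sketch only: `TwitterJob`, `prompt_twitter_job`, `excel_filename`, and `save_tweets` are hypothetical names, and reading the password with `getpass` (so it is not echoed) is an assumption rather than the script's documented behavior.

```python
from dataclasses import dataclass
from getpass import getpass


@dataclass
class TwitterJob:
    username: str
    password: str
    tweet_count: int
    keyword: str
    output_file: str


def excel_filename(name):
    """Append .xlsx when the user typed a bare name."""
    return name if name.lower().endswith(".xlsx") else name + ".xlsx"


def prompt_twitter_job(ask=input, ask_secret=getpass):
    """Collect the five Twitter inputs; both callables can be swapped for testing."""
    username = ask("Twitter username: ").strip()
    password = ask_secret("Twitter password: ")
    tweet_count = int(ask("Number of tweets to scrape: "))
    if tweet_count <= 0:
        raise ValueError("The number of tweets must be positive.")
    keyword = ask("Keyword to search for: ").strip()
    output_file = excel_filename(ask("Name of the Excel file: ").strip())
    return TwitterJob(username, password, tweet_count, keyword, output_file)


def save_tweets(rows, path):
    """Write a list of dictionaries to an Excel sheet with pandas."""
    import pandas as pd  # writing .xlsx also needs an engine such as openpyxl

    pd.DataFrame(rows).to_excel(path, index=False)
```

Validating the tweet count at the prompt avoids starting a browser session only to fail on bad input later.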

Where to Contribute?

Contributions to this project are welcome! In addition to improving the existing Twitter and Reddit scrapers, there are opportunities to develop similar scripts for other social media websites such as Facebook, Instagram, and more.

If you're interested in contributing, here are some ideas:

Develop a scraper for a different social media website
Improve the existing Twitter/Reddit scraper by adding new features or optimizing performance
Create a user-friendly UI for the scraper ✅
Add support for scraping multimedia content such as images and videos
Implement natural language processing techniques to analyze the scraped content

To contribute, you can fork the repository, make your changes, and submit a pull request. Before making any major changes, please create an issue to discuss your proposed changes with the project maintainers.

We appreciate any contributions to this project and look forward to seeing what the community can create!

How to Contribute

Contributions to this project are always welcome! If you would like to contribute, please follow these steps:

Fork the repository
Clone your fork to your local machine
Create a new branch for your feature or bug fix
Make your changes and commit them with descriptive commit messages
Push your changes to your fork
Create a pull request to the main repository

License

This project is licensed under the MIT License - see the LICENSE file for details. By contributing to this project, you agree that your contributions will be licensed under its MIT License.
