Social Scraper is a command-line tool built with Node.js that uses Playwright to scrape content from social media platforms.
Currently, Social Scraper supports the following providers:
- X (Twitter)
Social Scraper offers the following features:
- Multi-Platform Support: Compatible with different social media platforms through specific providers.
- Structured Storage: Saves the results in JSON files, organized by provider name and date.
- Session Management: Handles active sessions for efficient scraping.
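The provider-based design mentioned above could be organized roughly as follows. This is only a sketch under stated assumptions: the `registerProvider` and `runProvider` helpers, the `scrape()` method, and the `example` provider are hypothetical names chosen for illustration and are not taken from the Social Scraper codebase. The stub provider returns canned data instead of driving a browser.

```javascript
// Hypothetical provider registry; all names here are illustrative assumptions.
const providers = new Map();

// Register a provider object that exposes a name and an async scrape() function.
function registerProvider(provider) {
  if (typeof provider.name !== "string" || typeof provider.scrape !== "function") {
    throw new TypeError("A provider needs a string name and a scrape() function");
  }
  providers.set(provider.name, provider);
}

// Look up a provider by name and delegate the scraping work to it.
async function runProvider(name, query) {
  const provider = providers.get(name);
  if (!provider) {
    throw new Error(`Unknown provider: ${name}`);
  }
  return provider.scrape(query);
}

// Stub provider that returns canned posts (no browser involved).
registerProvider({
  name: "example",
  async scrape(query) {
    return [{ id: "1", content: `Result for ${query}`, media: [], metadata: {} }];
  },
});

runProvider("example", "nodejs").then((posts) => {
  console.log(`example provider returned ${posts.length} post(s)`);
});
```

With a layout like this, supporting another platform would mostly mean registering one more object with the same two members.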
Before installing, make sure you have the following:
- Node.js (version 18 or higher)
- npm
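To confirm that the local Node.js version meets the requirement above, a small preflight script along these lines may help. It is a sketch only: the helper names `parseMajor` and `meetsNodeRequirement` are hypothetical and not part of Social Scraper.

```javascript
// Hypothetical preflight check; not part of the Social Scraper codebase.
const MIN_NODE_MAJOR = 18;

// Extract the major version from a string such as "v18.17.0" or "20.1.0".
function parseMajor(version) {
  return Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
}

// Compare a version string (default: the running Node.js) with the minimum.
function meetsNodeRequirement(version = process.version) {
  return parseMajor(version) >= MIN_NODE_MAJOR;
}

if (meetsNodeRequirement()) {
  console.log(`Node.js ${process.version} satisfies the ${MIN_NODE_MAJOR}+ requirement.`);
} else {
  console.log(`Node.js ${MIN_NODE_MAJOR}+ is required, found ${process.version}.`);
}
```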
To install Social Scraper:
- Clone the Repository:
  git clone https://github.com/code3743/social-scraper.git
- Navigate to the Project Directory:
  cd social-scraper
- Install Dependencies:
  npm install
Social Scraper is a console-based tool. To run it, use the following command:
node app.js
The results are stored in the /results folder as JSON files named in the format providerName-date.json. Each file contains an array of posts with the following structure:
- id: The post's identifier.
- content: The textual content of the post.
- media: An array of URLs for associated media.
- metadata: An object containing additional relevant information.
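The file naming and post structure described above can be illustrated with a short script that builds a file name, writes a sample file, and checks each post against the four documented fields. This is a sketch under assumptions: the helper names are hypothetical, the provider name `x` is made up for the demo, and the date is assumed to be formatted as YYYY-MM-DD (the format of the date part is not specified here). The sample file is written to a temporary directory rather than /results.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Assumption: the date part is YYYY-MM-DD in UTC; the actual format may differ.
function resultsFileName(providerName, date = new Date()) {
  return `${providerName}-${date.toISOString().slice(0, 10)}.json`;
}

// Check that a value carries the four documented fields with plausible types.
// The type of `id` is not documented, so only its presence is checked.
function isPost(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    "id" in value &&
    typeof value.content === "string" &&
    Array.isArray(value.media) &&
    value.media.every((url) => typeof url === "string") &&
    value.metadata !== null &&
    typeof value.metadata === "object" &&
    !Array.isArray(value.metadata)
  );
}

// Read a results file and fail loudly if it is not an array of valid posts.
function loadPosts(filePath) {
  const parsed = JSON.parse(fs.readFileSync(filePath, "utf8"));
  if (!Array.isArray(parsed)) {
    throw new TypeError("Expected the results file to contain an array");
  }
  const invalidIndex = parsed.findIndex((post) => !isPost(post));
  if (invalidIndex !== -1) {
    throw new TypeError(`Post at index ${invalidIndex} has an unexpected shape`);
  }
  return parsed;
}

// Demo: write a sample file to a temporary directory and read it back.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "social-scraper-"));
const samplePath = path.join(
  demoDir,
  resultsFileName("x", new Date("2024-01-15T12:00:00Z"))
);
const samplePosts = [{ id: "1", content: "Hello", media: [], metadata: {} }];
fs.writeFileSync(samplePath, JSON.stringify(samplePosts, null, 2));
console.log(path.basename(samplePath), loadPosts(samplePath).length);
```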
If you would like to contribute to Social Scraper, please follow these steps:
- Fork the Repository
- Create a Branch for Your Feature or Bug Fix:
  git checkout -b feature/new-feature
- Make Your Changes and Commit Them:
  git commit -m "Description of changes"
- Push to Your Branch:
  git push origin feature/new-feature
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.