Cortex is a general-purpose web crawler designed and developed by Newron.ai. It simplifies collecting data from web sources so that users can extract the information they need with minimal setup.
- Customizable and extensible crawling rules
- Advanced filtering options to target specific data
- Multithreading support for faster crawling
- Built-in caching and request throttling to prevent overloading target websites
- User-friendly Electron app with React and Tailwind CSS for easy configuration and management
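To make the caching and request-throttling feature more concrete, here is a minimal sketch of one way such behavior could work. It is not taken from the Cortex codebase: `createPoliteFetcher`, `politeFetch`, and their parameters are hypothetical names chosen for illustration, and Cortex's actual implementation may differ.

```javascript
// Illustrative sketch only: these names are hypothetical and are not Cortex's API.

// Resolves after `ms` milliseconds.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Creates a fetcher that caches results by URL and spaces out network
// requests so consecutive requests start at least `minDelayMs` apart.
// `now` and `sleep` are injectable so the timing can be tested.
function createPoliteFetcher(fetchFn, minDelayMs, now = Date.now, sleep = defaultSleep) {
  const cache = new Map(); // url -> Promise of the fetched result
  let nextSlot = 0;        // earliest time the next request may start

  return function politeFetch(url) {
    // Cache hit: reuse the earlier result, no network call and no delay.
    if (cache.has(url)) return cache.get(url);

    // Reserve the next free time slot for this request.
    const current = now();
    const startAt = Math.max(current, nextSlot);
    nextSlot = startAt + minDelayMs;

    const result = sleep(startAt - current).then(() => fetchFn(url));
    cache.set(url, result);
    // Do not keep failed requests in the cache, so they can be retried.
    result.catch(() => cache.delete(url));
    return result;
  };
}
```

Assuming a runtime with a global `fetch` (for example Node.js 18 or later), this could be used as `createPoliteFetcher((url) => fetch(url).then((r) => r.text()), 1000)` to allow at most one new request per second. Caching the promise rather than the resolved value means that two simultaneous requests for the same URL share a single network call.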
These instructions will help you set up Cortex on your local machine for development and testing purposes.
Before you start, make sure Git and Node.js (which includes npm) are installed on your system; the setup commands below rely on `git` and `npm`.
- Clone the repository:
    git clone https://github.com/Newron.ai/Cortex.git
- Navigate to the project directory:
    cd Cortex
- Install the dependencies:
    npm install
- Run the application:
    npm start
- Open the Cortex application.
- Configure the crawling rules, filters, and other options through the user interface.
- Start the crawler by clicking the "Start Crawler" button.
- Monitor the progress and view the collected data in the application.
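This README does not describe how the crawling rules and filters are represented internally. As a rough illustration of what evaluating such a rule could look like, the sketch below checks a URL against a rule object. The rule shape (`allowedHosts`, `includePatterns`, `excludePatterns`, `maxDepth`) and the `shouldCrawl` function are assumptions made for this example, not Cortex's actual configuration format.

```javascript
// Hypothetical rule shape for illustration; Cortex's real configuration may differ.
const exampleRule = {
  allowedHosts: ["example.com"],  // only crawl these hostnames
  includePatterns: [/^\/blog\//], // path must match at least one of these
  excludePatterns: [/\.pdf$/i],   // path must match none of these
  maxDepth: 2,                    // maximum link depth from the start page
};

// Returns true when `url`, found at link depth `depth`, satisfies `rule`.
function shouldCrawl(rule, url, depth) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not an absolute, well-formed URL
  }
  if (depth > rule.maxDepth) return false;
  if (!rule.allowedHosts.includes(parsed.hostname)) return false;
  if (rule.excludePatterns.some((re) => re.test(parsed.pathname))) return false;
  return rule.includePatterns.some((re) => re.test(parsed.pathname));
}
```

With `exampleRule`, a blog page on `example.com` within two links of the start page is accepted, while PDF files, other hosts, and deeper pages are rejected. Checking exclusions before inclusions keeps the rule conservative: a URL that matches both lists is skipped.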
We welcome contributions from the community. If you'd like to contribute to the Cortex project, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them with a descriptive commit message.
- Push your changes to your fork.
- Open a pull request and describe the changes you made.
Please make sure to follow our coding standards and guidelines when contributing.
This project is licensed under the MIT License. See the LICENSE file for more details.