GitHub crawler to search in repositories, issues and wikis through a proxy.

AlexLoar/githhub_crawler

GitHub Crawler

This crawler searches GitHub repositories, wikis or issues for the keywords you pass to it and returns a list of URLs of the items found.

It consists of a single endpoint, built with FastAPI, which handles the input and carries out the crawling process.

Usage

To use this crawler, run the main commands through the Makefile:

Run the server: make up

Stop the server: make down

Run the tests: make test

Endpoint documentation

To see the online documentation, you can go here once the project is launched.

POST localhost:8000/crawler

Body example:

{
  "keywords": [
    "openstack",
    "nova",
    "css"
  ],
  "proxies": [
    "78.110.174.119:8080"
  ],
  "type": "Wikis"
}
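As an illustration, the request above could be sent from Python roughly as follows. This is a minimal sketch using only the standard library; the function names `build_crawler_request` and `run_crawler` are hypothetical, the base URL assumes the server is running locally on port 8000 as in the example, and the exact shape of the JSON response is an assumption (the README only says that a list of URLs is returned).

```python
import json
import urllib.request

# Assumed base URL, taken from the example above; adjust to your deployment.
CRAWLER_URL = "http://localhost:8000/crawler"


def build_crawler_request(keywords, proxies, search_type, url=CRAWLER_URL):
    """Build (but do not send) a POST request carrying the JSON body."""
    payload = {"keywords": keywords, "proxies": proxies, "type": search_type}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_crawler(request, timeout=30):
    """Send the request and decode the JSON response.

    The response is assumed to be JSON containing the found URLs.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the server to be up (make up).
    request = build_crawler_request(
        ["openstack", "nova", "css"], ["78.110.174.119:8080"], "Wikis"
    )
    print(run_crawler(request))
```

Building the request separately from sending it makes it possible to inspect or test the payload without a running server.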

keywords: List of keywords to use in the search.

proxies: List of proxies used to make the request to GitHub. One will be picked from the list randomly.

type: Type of entity to search. Must be one of the following values: Repositories, Wikis or Issues.
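To make the three parameters concrete, the sketch below shows one way they could be combined into a GitHub search request. It is not taken from this project's source: the helper names `build_search_url` and `pick_proxy` are hypothetical, and it assumes the crawler queries the public `https://github.com/search` page with `q` and `type` query parameters.

```python
import random
from urllib.parse import urlencode

# Values the "type" field may take, as listed above.
ALLOWED_TYPES = ("Repositories", "Wikis", "Issues")


def build_search_url(keywords, search_type):
    """Join the keywords into one query and build a GitHub search URL (assumed format)."""
    if search_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {ALLOWED_TYPES}, got {search_type!r}")
    query = urlencode({"q": " ".join(keywords), "type": search_type})
    return f"https://github.com/search?{query}"


def pick_proxy(proxies):
    """Pick one proxy at random and return it as a scheme-to-URL mapping."""
    proxy = random.choice(proxies)
    return {"http": f"http://{proxy}", "https": f"http://{proxy}"}


if __name__ == "__main__":
    print(build_search_url(["openstack", "nova", "css"], "Wikis"))
    print(pick_proxy(["78.110.174.119:8080"]))
```

The mapping returned by `pick_proxy` follows the scheme-to-URL form accepted by common Python HTTP clients, such as the `proxies` argument of `requests` or `urllib.request.ProxyHandler`.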
