Web Scraper API
Scrape any website's H1, H2, H3 and link tags.
These instructions will get you a copy of the project up and running on your local machine for development.
Things you need to install beforehand:
- Rails - Ruby Framework
Open a terminal and run the following commands to clone and run this project.
$ git clone https://github.com/SeeYouSpaceCowboy/api-scraper.git
$ cd api-scraper
$ bundle install
$ rails s
This project should now be running locally on port 3000.
This section briefly documents how the API should behave.
Example calls are made using JavaScript's axios npm package.
Saves H1, H2, H3 and links from a given URL to the database.
URL Endpoint | Method | URL Params | Success Response |
---|---|---|---|
/url | POST | Required: link=[string] | 200 |
Sample Call:
axios.post('http://localhost:3000/v1/urls', { url: 'http://dailynews.com' })
.then(response => response.data)
.catch(error => error)
Content:
[
{
"link": "http://dailynews.com",
"h1": [
{
"content": "Passenger dies after car crash in North Hollywood",
"link": "http://www.dailynews.com/2017/10/27/1-in-critical-condition-after-car-crash-in-north-hollywood/"
},
...
],
"h2": [
{
"content": "LA Metro security guards attacked near Watts station; one shot at with his own gun",
"link": "http://www.dailynews.com/2017/10/27/la-metro-security-guards-attacked-near-watts-station-one-shot-at-with-his-own-gun/"
},
...
],
"h3": [ ... ],
"a": [ ... ]
},
...
]
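Before sending the save request, it can help to check the URL on the client side. The following is a minimal sketch, not part of this project: `buildSaveRequest` is a hypothetical helper, and it assumes the `/v1/urls` path and the `url` body field used in the sample call.

```javascript
// Hypothetical helper (not part of this project): builds a request config
// for the save call after checking that the page URL is well formed.
// Assumes the /v1/urls path and the `url` body field from the sample call.
function buildSaveRequest(pageUrl, baseUrl = 'http://localhost:3000') {
  // new URL() throws a TypeError when the input is not a valid absolute URL
  const parsed = new URL(pageUrl);
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return {
    method: 'post',
    url: `${baseUrl}/v1/urls`,
    data: { url: pageUrl },
  };
}
```

The returned object has the shape of an axios request config, so it could be passed along as `axios(buildSaveRequest('http://dailynews.com')).then(response => response.data)`.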
Get back H1, H2, H3 and links from previously saved URLs in the database.
URL Endpoint | Method | URL Params | Success Response |
---|---|---|---|
/url | GET | N/A | 200 |
Sample Call:
axios.get('http://localhost:3000/v1/urls')
.then(response => response.data)
.catch(error => error)
Content:
[
{
"link": "http://dailynews.com",
"h1": [
{
"content": "Passenger dies after car crash in North Hollywood",
"link": "http://www.dailynews.com/2017/10/27/1-in-critical-condition-after-car-crash-in-north-hollywood/"
},
...
],
"h2": [
{
"content": "LA Metro security guards attacked near Watts station; one shot at with his own gun",
"link": "http://www.dailynews.com/2017/10/27/la-metro-security-guards-attacked-near-watts-station-one-shot-at-with-his-own-gun/"
},
...
],
"h3": [ ... ],
"a": [ ... ]
},
...
]
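Since the response nests headings under each saved page, a flat list is often easier to work with. The following is a minimal sketch, not part of this project: `collectHeadings` is a hypothetical helper, and it assumes each page object carries the `link`, `h1`, `h2` and `h3` keys shown in the sample content above.

```javascript
// Hypothetical helper (not part of this project): flattens the response
// array into one row per heading, tagged with the page it came from.
// Assumes each page object has "link", "h1", "h2" and "h3" keys as in the
// sample content; a missing key is treated as an empty list.
function collectHeadings(pages) {
  const tags = ['h1', 'h2', 'h3'];
  return pages.flatMap(page =>
    tags.flatMap(tag =>
      (page[tag] || []).map(item => ({
        page: page.link,
        tag,
        content: item.content,
        link: item.link,
      }))
    )
  );
}
```

It could be chained onto the sample call as `axios.get('http://localhost:3000/v1/urls').then(response => collectHeadings(response.data))`.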
Scraper was built by Mohammed Chisti.