Scraper

Web Scraper API

Scraper Screenshot
Scrape any website's H1, H2, H3 and link tags.

Getting Started with Scraper

These instructions will get you a copy of the project up and running on your local machine for development.

Prerequisites

Things you need to install beforehand:

  • Rails - Ruby Framework

Installing

Open a terminal and run the following commands to clone and start the project.

$ git clone https://github.com/SeeYouSpaceCowboy/api-scraper.git
$ cd api-scraper
$ bundle install
$ rails s

This project should now be running locally on port 3000.

About

This section documents how the API behaves.

Example calls are made using JavaScript's axios npm package.
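As a minimal sketch (the `API_BASE` constant and `endpoint` helper are illustrative, not part of the project), the base URL for a locally running server can be kept in one place:

```javascript
// Base URL for the locally running Rails server (port 3000 by default).
// The `endpoint` helper name is illustrative, not part of the API.
const API_BASE = 'http://localhost:3000/v1';

function endpoint(path) {
  return `${API_BASE}/${path}`;
}

// endpoint('urls') builds 'http://localhost:3000/v1/urls', the URL
// used by the sample calls below.
```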

Scrape Given URL

Saves H1, H2, H3 and links from a given URL to the database.

| URL Endpoint | Method | URL Params             | Success Response |
|--------------|--------|------------------------|------------------|
| /v1/urls     | POST   | Required: url=[string] | 200              |

Sample Call:

  axios.post('http://localhost:3000/v1/urls', { url: 'http://dailynews.com' })
    .then(response => response.data)
    .catch(error => error)

Content:

[
  {
    "link": "http://dailynews.com",
    "h1": [
      {
          "content": "Passenger dies after car crash in North Hollywood",
          "link": "http://www.dailynews.com/2017/10/27/1-in-critical-condition-after-car-crash-in-north-hollywood/"
      },
      ...
    ],
    "h2": [
      {
          "content": "LA Metro security guards attacked near Watts station; one shot at with his own gun",
          "link": "http://www.dailynews.com/2017/10/27/la-metro-security-guards-attacked-near-watts-station-one-shot-at-with-his-own-gun/"
      },
      ...
    ],
    "h3": [ ... ],
    "a": [ ... ]
  },
  ...
]
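The nested response can be post-processed on the client. A hedged sketch (the `headingTexts` helper is illustrative and assumes the response shape shown above):

```javascript
// Collect the text content of a given heading tag ('h1', 'h2', 'h3')
// across every scraped page in a response array.
function headingTexts(pages, tag) {
  return pages.flatMap(page => (page[tag] || []).map(entry => entry.content));
}

// A trimmed-down response in the shape documented above.
const sample = [
  {
    link: 'http://dailynews.com',
    h1: [
      {
        content: 'Passenger dies after car crash in North Hollywood',
        link: 'http://www.dailynews.com/2017/10/27/1-in-critical-condition-after-car-crash-in-north-hollywood/'
      }
    ],
    h2: [],
    h3: [],
    a: []
  }
];

headingTexts(sample, 'h1');
// → ['Passenger dies after car crash in North Hollywood']
```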

Get All H1, H2, H3 and Links

Get back the H1, H2, H3 and link tags of every previously saved URL from the database.

| URL Endpoint | Method | URL Params | Success Response |
|--------------|--------|------------|------------------|
| /v1/urls     | GET    | N/A        | 200              |

Sample Call:

  axios.get('http://localhost:3000/v1/urls')
    .then(response => response.data)
    .catch(error => error)

Content:

[
  {
    "link": "http://dailynews.com",
    "h1": [
      {
          "content": "Passenger dies after car crash in North Hollywood",
          "link": "http://www.dailynews.com/2017/10/27/1-in-critical-condition-after-car-crash-in-north-hollywood/"
      },
      ...
    ],
    "h2": [
      {
          "content": "LA Metro security guards attacked near Watts station; one shot at with his own gun",
          "link": "http://www.dailynews.com/2017/10/27/la-metro-security-guards-attacked-near-watts-station-one-shot-at-with-his-own-gun/"
      },
      ...
    ],
    "h3": [ ... ],
    "a": [ ... ]
  },
  ...
]
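Since this endpoint returns every saved URL, a client may want a lookup keyed by link. A small sketch (the `indexByLink` name is illustrative, assuming the array-of-pages shape shown above):

```javascript
// Build an object mapping each saved URL to its scraped tags,
// assuming the array-of-pages response shape shown above.
function indexByLink(pages) {
  const byLink = {};
  for (const page of pages) {
    byLink[page.link] = page;
  }
  return byLink;
}

const saved = [{ link: 'http://dailynews.com', h1: [], h2: [], h3: [], a: [] }];
const byLink = indexByLink(saved);
// byLink['http://dailynews.com'] holds the scraped tags for that page.
```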

Contributors

Scraper was built by Mohammed Chisti.
