## How to use
To use the [boilerpipe](https://boilerpipe-web.appspot.com/) API, send requests with any HTTP client. The examples below use Python and the `requests` library.


In [16]:
import json
import requests 

SERVER_IP = 'localhost'
SERVER_PORT = '8080'
TARGET_URL = 'https://www.bbc.com/news/uk-politics-47796377'

### Example 1: Extract an article from the BBC with the default ArticleExtractor
In this example, the API extracts the article text from a bbc.com page.

In [None]:
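# Pass the target as a query parameter so that requests URL-encodes it
resp = requests.get('http://{}:{}/extractText'.format(SERVER_IP, SERVER_PORT), params={'url': TARGET_URL})
resp.raise_for_status()  # fail early on an HTTP error status
resp_dict = resp.json()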
# At this point resp_dict looks like
# {'url': 'https://www.bbc.com/...', 'extractedText': 'Brexit deadlock: The Commons in numbers ...'}

print('URL={url}\n\nExtractedText:\n{extractedText}'.format(**resp_dict))

### Example 2: Extract an article as HTML from the BBC with KeepEverythingExtractor

In [None]:
extractor = 'KeepEverythingExtractor'
# Pass both arguments as query parameters so that requests URL-encodes them
resp = requests.get('http://{}:{}/extractHTML'.format(SERVER_IP, SERVER_PORT),
                    params={'url': TARGET_URL, 'extractor': extractor})
resp_dict = resp.json()
# At this point resp_dict looks like
# {'url': 'https://www.bbc.com/...', 'extractedHTML': '<HTML lang="en" id="responsive-news"> ...'}

print('URL={url}\n\nExtractedHTML:\n{extractedHTML}'.format(**resp_dict))

#### The HTML block below left-aligns the tables in this notebook

In [17]:
%%html
<style>
table {align:left; display:block}
</style>

### Methods

Two methods are currently available:

| Method | Description  |
|:---|---|
|``extractText``| Retrieve plain text using the chosen extractor |
|``extractHTML``| Retrieve HTML using the chosen extractor |

**The response has the following format**:

``extractText`` returns **JSON** with the keys ``url`` and ``extractedText``.

``extractHTML`` returns **JSON** with the keys ``url`` and ``extractedHTML``.
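
To make the response format concrete, here is a minimal sketch of a helper that unpacks a response body and checks that the expected keys are present. The name `parse_extraction` is hypothetical and not part of the API, and the sample body is invented for illustration.

```python
import json


def parse_extraction(body, method="extractText"):
    """Return (url, content) from a JSON response body.

    Hypothetical helper: `method` selects which key is expected,
    following the response format described above.
    """
    key = "extractedText" if method == "extractText" else "extractedHTML"
    data = json.loads(body)
    missing = [k for k in ("url", key) if k not in data]
    if missing:
        raise KeyError("response is missing keys: {}".format(missing))
    return data["url"], data[key]


# Illustrative body only, not real server output
sample = '{"url": "https://example.com/article", "extractedText": "Sample text"}'
url, text = parse_extraction(sample)
print(url)
print(text)
```

With a live server, `resp.text` from `requests` can be passed as `body`.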


### Arguments

| Argument | Description  |
|:---|---|
| url | URL of the web page to run the extractor on |
| extractor | Name of the extractor to use. Optional; defaults to ``ArticleExtractor`` |
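
Because the target URL is itself passed inside a query string, it should be percent-encoded. The sketch below builds a request URL from the two arguments using only the standard library. The function name `build_request_url` and the `host` and `port` defaults are assumptions for illustration, matching the local server used in the examples above.

```python
from urllib.parse import urlencode


def build_request_url(method, target_url, extractor=None,
                      host="localhost", port=8080):
    """Build a request URL for `method` (extractText or extractHTML).

    Hypothetical helper: host and port are assumed to point at a
    locally running server.
    """
    params = {"url": target_url}
    if extractor is not None:
        # Omit the argument entirely to let the server pick its default
        params["extractor"] = extractor
    # urlencode percent-encodes ':', '/', '?' and '=' inside the values
    return "http://{}:{}/{}?{}".format(host, port, method, urlencode(params))


print(build_request_url("extractHTML",
                        "https://www.bbc.com/news/uk-politics-47796377",
                        extractor="KeepEverythingExtractor"))
```

The returned string can be passed directly to `requests.get`.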

### Extractors

Only the extractors listed below can be used with this API:

- ``DefaultExtractor``
- ``ArticleExtractor``
- ``ArticleSentencesExtractor``
- ``KeepEverythingExtractor``
- ``KeepEverythingWithMinKWordsExtractor``
- ``LargestContentExtractor``
- ``NumWordsRulesExtractor``
- ``CanolaExtractor``
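
Since only these names are accepted, it can help to check an extractor name on the client side before sending a request. This is a minimal sketch; `SUPPORTED_EXTRACTORS` and `check_extractor` are hypothetical names, and the set simply mirrors the list above.

```python
# Mirrors the list of extractors documented above
SUPPORTED_EXTRACTORS = frozenset([
    "DefaultExtractor",
    "ArticleExtractor",
    "ArticleSentencesExtractor",
    "KeepEverythingExtractor",
    "KeepEverythingWithMinKWordsExtractor",
    "LargestContentExtractor",
    "NumWordsRulesExtractor",
    "CanolaExtractor",
])


def check_extractor(name):
    """Return `name` unchanged if it is supported, else raise ValueError."""
    if name not in SUPPORTED_EXTRACTORS:
        raise ValueError(
            "unsupported extractor {!r}; expected one of: {}".format(
                name, ", ".join(sorted(SUPPORTED_EXTRACTORS))))
    return name


extractor = check_extractor("KeepEverythingExtractor")
print(extractor)
```

A misspelled name then fails locally with a clear message instead of depending on how the server reports the error.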

To read more about the extractors, visit:
[https://boilerpipe-web.appspot.com/](https://boilerpipe-web.appspot.com/)

