# API Connections Project
The goal of this project is to connect APIs through a unified query. This notebook will serve as a Python wrapper for the implemented APIs.

## Build Query
Each API will be called with a common query. The query must be structured so that it can be passed to each of the different APIs.

### Example APIs
The specific APIs tested are:
- https://aylien.com/news-api (SDK)
- https://newsapi.org          (SDK)
- https://gnews.io/docs/v2     (API)

#### News API
Methods

    `get_top_headlines()`
        Returns: json
    
    `get_everything()`
        Returns: json
    
Parameters

    Both methods accept:
    - query
    - sources
    - language
    - country
    
    In addition, `get_everything()` can accept:
    - sortby
    - pages
    - start 
    - end
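
As a minimal sketch of how the parameters above might be assembled, the helper below builds keyword arguments for `get_everything()` and drops anything left unset. The helper name `build_everything_kwargs` is hypothetical and not part of the SDK. The keyword names `from_param` and `to` for the start and end dates are assumptions about the `newsapi-python` client and should be checked against the installed version.

```python
def build_everything_kwargs(query, sources=None, language=None,
                            sort_by=None, page=None, start=None, end=None):
    """Return keyword arguments for NewsApiClient.get_everything().

    Hypothetical helper: parameters left as None are omitted.
    """
    candidates = {
        'q': query,
        'sources': sources,
        'language': language,
        'sort_by': sort_by,
        'page': page,
        'from_param': start,  # assumed client name for the start date
        'to': end,            # assumed client name for the end date
    }
    # Keep only the parameters that were actually supplied
    return {name: value for name, value in candidates.items() if value is not None}


kwargs = build_everything_kwargs('Microsoft', language='en', sort_by='relevancy')
print(kwargs)
```

The resulting dictionary could then be passed on as `newsapi.get_everything(**kwargs)`.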

In [1]:
# Common settings
QUERY = "Microsoft"

In [2]:
def query_newsapi(query):
    try:
        from newsapi import NewsApiClient
    except ModuleNotFoundError:
        print('> Module not found, installing via pip')
        ! pip install newsapi-python
        from newsapi import NewsApiClient

    # Init
    newsapi = NewsApiClient(api_key='xxxxxxxxxxxxxxxxxxxxxx')

    # /v2/top-headlines
    # Default to an empty result so the return below still works if the call fails
    top_headlines = {'articles': []}
    try:
        top_headlines = newsapi.get_top_headlines(q=query,
                                                  category='business',
                                                  language='en',
                                                  country='us')
        if len(top_headlines['articles']) == 0:
            print(f'> No top headlines found for query = {query}')
    except ValueError:
        print('''
    ValueError:
    This API accepts either sources or category, language, country. Providing any of the latter list in addition to the
    former results in a ValueError''')

    # /v2/everything
    all_articles = newsapi.get_everything(q=query,
                                          sources='bbc-news,the-verge',
                                          domains='bbc.co.uk,techcrunch.com',
                                          language='en',
                                          sort_by='relevancy',
                                          page=2)
    print('> News API found {} article(s) matching the query '.format(len(all_articles['articles'])))
    return top_headlines, all_articles

In [3]:
# # Sample API News code
# apinews = query_newsapi(QUERY)
# apinews[1]['articles']

#### Aylien News API
Methods

    `list_stories()`
        Returns: json
    
    `get_everything()`
        Returns: json
    
Parameters

    Both methods accept:
    - title
    - sort_by
    - language
    - not_language
    - published_at_start
    - published_at_end

In [4]:
def query_aylien(query):
    try:
        import aylien_news_api
    except ModuleNotFoundError:
        ! pip install git+https://github.com/AYLIEN/aylien_newsapi_python.git
        import aylien_news_api
    from aylien_news_api.rest import ApiException

    # Configure API key authorization: app_id
    aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-ID'] = 'xxxxxxxx'
    # Configure API key authorization: app_key
    aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-Key'] = 'xxxxxxxxxxxxxxxxxxxx'

    # create an instance of the API class
    api_instance = aylien_news_api.DefaultApi()

    opts = {
      'title': query,
      'sort_by': 'social_shares_count.facebook',
      'language': ['en'],
      'not_language': ['es', 'it'],
      'published_at_start': 'NOW-7DAYS',
      'published_at_end': 'NOW',
      'entities_body_links_dbpedia': [
        'http://dbpedia.org/resource/Donald_Trump',
        'http://dbpedia.org/resource/Hillary_Rodham_Clinton'
      ]
    }

    try:
        # List stories
        api_response = api_instance.list_stories(**opts)
#         print("API called successfully. Returned data: ")
#         print("========================================")
#         for story in api_response.stories:
#           print(story.title + " / " + story.source.name)
        print(f'> Aylien News API found {len(api_response.stories)} articles(s) matching the query')
        return api_response
    except ApiException as e:
        print("Exception when calling DefaultApi->list_stories: %s\n" % e)
    


In [5]:
## Sample aylien code
#aylien = query_aylien(QUERY)
#aylien.stories

### GNews API
GNews does not provide an SDK for direct Python interaction, but it does provide an HTTP API. Here, Python builds the request string, sends the HTTP request, and reads the response.

#### Building the Request
The query is sent to `https://gnews.io/api/v2/?q=example&token=API-Token` with the following parameters:

<table class="table table-bordered" style="margin-bottom: 0;">
                                <thead>
                                    <tr>
                                        <th scope="col">Parameter</th>
                                        <th scope="col">Information</th>
                                        <th scope="col">Default value</th>
                                        <th scope="col">Description</th>
                                    </tr>
                                </thead>
                                <tbody>
                                    <tr>
                                        <th scope="row">q</th>
                                        <td>Required</td>
                                        <td>None</td>
                                        <td>Your search</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">token</th>
                                        <td>Required</td>
                                        <td>None</td>
                                        <td>Your API Token</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">max</th>
                                        <td>Optional</td>
                                        <td>10</td>
                                        <td>The number of items you want (max possible : 100)</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">lang</th>
                                        <td>Optional</td>
                                        <td>en</td>
                                        <td>Language of the articles (see the GNews documentation for the list of languages)</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">country</th>
                                        <td>Optional</td>
                                        <td>us</td>
                                        <td>The primary article origin (see the GNews documentation for the list of countries)</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">mindate</th>
                                        <td>Optional</td>
                                        <td>None</td>
                                        <td>Get articles that are more recent than the min date</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">maxdate</th>
                                        <td>Optional</td>
                                        <td>None</td>
                                        <td>Get articles that are less recent than the max date</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">in</th>
                                        <td>Optional</td>
                                        <td>all</td>
                                        <td>Get articles that contain q in the specified article section</td>
                                    </tr>
                                    <tr>
                                        <th scope="row">image</th>
                                        <td>Optional</td>
                                        <td>optional</td>
                                        <td>Get articles with images required or with images optional</td>
                                    </tr>
                                </tbody>
                            </table>
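
The parameter table can be turned into a request string roughly as follows. This is a sketch using the standard library's `urlencode`; the function name `build_gnews_url` is hypothetical, and the endpoint is the one quoted above.

```python
from urllib.parse import urlencode

GNEWS_URL = 'https://gnews.io/api/v2/'  # endpoint taken from the query string above


def build_gnews_url(query, token, **optional):
    """Build a GNews request URL, skipping optional parameters that are unset."""
    params = {'q': query, 'token': token}
    for name, value in optional.items():
        # Treat None and the empty string as "use the API default"
        if value not in (None, ''):
            params[name] = value
    # urlencode escapes spaces and reserved characters in the values
    return f'{GNEWS_URL}?{urlencode(params)}'


print(build_gnews_url('Microsoft Azure', 'API-Token', lang='en', max=5, mindate=''))
```

Because `in` is a reserved word in Python, that parameter has to be passed by unpacking a dictionary, for example `build_gnews_url('Microsoft', 'API-Token', **{'in': 'title'})`.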
                            
The request returns a response like:
```json
{
    "timestamp": 1549032306,
    "count_results": 1,
    "articles": [
        {
            "title": "This Tiny Router Could be the Next Big Thing",
            "desc": "It seems like only yesterday that the Linksys WRT54G and the various open source firmware replacements for it were the pinnacle of home router hacking.",
            "link": "https://hackaday.com/2019/02/01/this-tiny-router-could-be-the-next-big-thing/",
            "website": "https://hackaday.com",
            "source": "Hackaday",
            "date": "Fri, 01 Feb 2019 09:00:00 GMT",
            "image": "https://lh5.googleusercontent.com/proxy/IBqrCuhBM7bQOO6kXY_pnkM3D3OEta9U3v4O_ieACE_Xq9hQTB7SvHEgmpzdyxK2uARoQBJijHdOE3HWdmckMROCd4itXCVh9-rXpgdQn2hhmw=-c"
        }
    ]
}
```
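
A response of this shape can be reduced to the fields needed later in the notebook. The sketch below is not part of the GNews documentation; `summarize_gnews` is a hypothetical helper, and it assumes the `date` field always uses the RFC 2822 style shown in the sample.

```python
from email.utils import parsedate_to_datetime


def summarize_gnews(response):
    """Return (title, link, published datetime) tuples from a GNews response dict."""
    rows = []
    for article in response.get('articles', []):
        # Dates such as "Fri, 01 Feb 2019 09:00:00 GMT" follow the RFC 2822 style
        published = parsedate_to_datetime(article['date'])
        rows.append((article['title'], article['link'], published))
    return rows


# Trimmed copy of the sample response shown above
sample = {
    "timestamp": 1549032306,
    "count_results": 1,
    "articles": [{
        "title": "This Tiny Router Could be the Next Big Thing",
        "link": "https://hackaday.com/2019/02/01/this-tiny-router-could-be-the-next-big-thing/",
        "date": "Fri, 01 Feb 2019 09:00:00 GMT",
    }],
}
for title, link, published in summarize_gnews(sample):
    print(published.date(), title)
```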

In [17]:
def query_gnews(query):
    ''' Use the GNews API to create a dictionary of results
    Parameters
    ----------
        query: str
            The term to be searched in GNews
    Returns
    --------
        response: dict
            Get the json of the response and convert to a dictionary to match output of other APIs
    '''
    
    import requests
    url = 'https://gnews.io/api/v2/'
    # Required parameters
    params = {'q': query, 'token': 'xxxxxxxxxxxxxxxxxxxxxxxxxx'}
    # Optional parameters; an empty string means "not set"
    max_count = 'max', ''
    lang = 'lang', 'en'
    country = 'country', 'us'
    mindate = 'mindate', ''
    maxdate = 'maxdate', ''
    in_section = 'in', ''
    image = 'image', ''

    # Add only the optional parameters that were given a value
    for name, param in [max_count, lang, country, mindate, maxdate, in_section, image]:
        if param != '':
            params[name] = param
    # Let requests build and URL-encode the query string
    response = requests.get(url, params=params)
#     if response:
#         print('Success!')
#     else:
#         print('An error has occurred.')
    print('> GNews API found {} articles(s) matching the query'.format(len(response.json()['articles'])))
    return dict(response.json())

In [18]:
# # Sample gnews code
# gnews = query_gnews(QUERY)
# gnews['articles']

In [19]:
def query_all(query):
    print(f'> Searching for {query}')
    apinews = query_newsapi(query)
    aylien = query_aylien(query)
    gnews = query_gnews(query)
    
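    links = set()
    # News API returns a (top_headlines, all_articles) pair
    for result in apinews:
        for article in result['articles']:
            links.add(article['url'])

    # query_aylien returns None when the API call fails
    if aylien is not None:
        for story in aylien.stories:
            links.add(story.links.permalink)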
            
    for article in gnews['articles']:
        links.add(article['link'])
        
    print(f"> Found {len(links)} unique links for the query.")
    return links


In [20]:
links = query_all(QUERY)

> Searching for Microsoft
> News API found 20 article(s) matching the query 
> Aylien News API found 9 articles(s) matching the query
> GNews API found 10 articles(s) matching the query
> Found 39 unique links for the query.
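
The set above only removes links that match exactly as strings, so the same article could still be counted twice if two APIs return it with a different scheme, host capitalization, or trailing slash. One possible refinement, not part of the original notebook, is to normalize each link before adding it. The helper `normalize_link` is hypothetical, and treating `http` and `https` as the same resource is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_link(url):
    """Return a comparison key for a URL: https scheme, lowercase host, no trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip('/')
    # Assumption: http and https point at the same article; the fragment is dropped
    return urlunsplit(('https', parts.netloc.lower(), path, parts.query, ''))


raw_links = [
    'http://techcrunch.com/2019/09/18/github-acquires-code-analysis-tool-semmle/',
    'https://TechCrunch.com/2019/09/18/github-acquires-code-analysis-tool-semmle',
]
unique = {normalize_link(link) for link in raw_links}
print(len(unique))
```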


In [21]:
links

{'http://techcrunch.com/2019/09/16/salesforce-doubles-down-on-verticals-launches-manufacturing-and-consumer-goods-clouds/',
 'http://techcrunch.com/2019/09/17/linkedin-launches-skills-assessments-tests-that-let-you-beef-up-your-credentials-for-job-hunting/',
 'http://techcrunch.com/2019/09/18/github-acquires-code-analysis-tool-semmle/',
 'http://techcrunch.com/2019/09/23/facebook-buys-startup-building-neural-monitoring-armband/',
 'http://techcrunch.com/2019/09/24/alibaba-unveils-hanguang-800-an-ai-inference-chip-it-says-significantly-increases-the-speed-of-machine-learning-tasks/',
 'http://techcrunch.com/2019/09/24/windows-10-now-runs-on-over-900m-devices/',
 'http://techcrunch.com/2019/09/25/how-amazon-is-closing-out-competitors-by-opening-up-voice/',
 'http://techcrunch.com/2019/10/01/daily-crunch-wework-delays-its-ipo/',
 'http://techcrunch.com/2019/10/01/pitch-a-presentation-startup-from-wunderlists-founders-raises-30m-more-to-take-on-powerpoint/',
 'http://techcrunch.com/2019/10

### Common Parameters

The following parameters are common to all APIs:
- search term (query, title)
- language
- from date (published_at_start)
- to date (published_at_end)
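
One way to act on this list is to hold the shared parameters in a single object and translate them into each API's own names. The sketch below is an assumption about how the unified query could look, not the project's implementation: `CommonQuery`, `PARAM_NAMES`, and `translate` are hypothetical names, the Aylien and GNews parameter names come from the sections above, and the News API date names (`from_param`, `to`) are assumed from the `newsapi-python` client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommonQuery:
    """Hypothetical container for the parameters shared by all three APIs."""
    term: str
    language: str = 'en'
    from_date: Optional[str] = None
    to_date: Optional[str] = None


# Per-API name for each common field (News API date names are assumptions)
PARAM_NAMES = {
    'newsapi': {'term': 'q', 'language': 'language',
                'from_date': 'from_param', 'to_date': 'to'},
    'aylien': {'term': 'title', 'language': 'language',
               'from_date': 'published_at_start', 'to_date': 'published_at_end'},
    'gnews': {'term': 'q', 'language': 'lang',
              'from_date': 'mindate', 'to_date': 'maxdate'},
}


def translate(query, api):
    """Map a CommonQuery onto one API's parameter names, skipping unset fields."""
    names = PARAM_NAMES[api]
    values = {'term': query.term, 'language': query.language,
              'from_date': query.from_date, 'to_date': query.to_date}
    return {names[field]: value for field, value in values.items() if value is not None}


common = CommonQuery(term='Microsoft', from_date='2019-09-01')
print(translate(common, 'gnews'))
```

Only the names are translated here. Value formats still differ between services (for example, `query_aylien` passes `language` as a list), so a full wrapper would also need per-API value conversion.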