# Extracting data from TOI API
1. Create an API key for Times Of India API.
2. Use that API key to create queries.
3. Fetch the recent articles published about "business analytics" from all sources.
4. For each article, extract the following fields: source-id, source-name, author, title, description, content.
5. Convert the JSON output to a data frame with each field as a column and each article as a row.
6. Save the output in CSV format ("output_api.csv").

# Solution

### Step 1
#### Secure the API Key
* Visit the [site](https://newsapi.org/s/the-times-of-india-api) to secure the API Key
> [!NOTE]
> You will have to create an account to get the API key.
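
Once you have the key, it is usually safer to keep it out of the notebook itself. The sketch below reads it from an environment variable; the variable name `NEWSAPI_KEY` and the helper `load_api_key` are assumptions made for this example and are not part of the News API client.

```python
import os


def load_api_key(env_var="NEWSAPI_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

With the variable set, the client in Step 4 could be created as `NewsApiClient(api_key=load_api_key())` instead of pasting the key into the code.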

### Step 2
#### Install the required packages
* Open the Anaconda command prompt in administrator mode and run `pip install requests` and `pip install newsapi-python` (`pandas`, which is used below, ships with Anaconda)
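
To confirm the installation worked before moving on, a quick check like the one below can help. It is only a sketch: `missing_packages` is a hypothetical helper, and it relies on the standard library's `importlib.util.find_spec`. Note that the `newsapi-python` distribution is imported under the name `newsapi`.

```python
import importlib.util


def missing_packages(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used in this walkthrough
print(missing_packages(["requests", "newsapi", "pandas"]))
```

An empty list means every package can be imported; any name that is printed still needs to be installed.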

### Step 3
#### Import the required packages
* Import the newly installed `requests` and `newsapi` packages along with `json` and `pandas`
* These packages make it easier to work with the JSON data format and the HTTP protocol

In [2]:
import json as js
import requests as rq
import pandas as pd
from newsapi import NewsApiClient
from pandas import json_normalize  # top-level location since pandas 1.0
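
To get a feel for the data before calling the API, the sketch below parses a small hand-written JSON document with the `json` package. The payload is made up for illustration; it only mirrors the fields listed in the task and the general shape the API is expected to return (`status`, `totalResults`, and a list of `articles`).

```python
import json

# Hand-written sample, not a real API response
sample_response = """
{
  "status": "ok",
  "totalResults": 1,
  "articles": [
    {
      "source": {"id": "the-times-of-india", "name": "The Times of India"},
      "author": null,
      "title": "Sample headline",
      "description": "Sample description",
      "content": "Sample content"
    }
  ]
}
"""

payload = json.loads(sample_response)
first = payload["articles"][0]

# "source" is a nested object, so its fields need a second lookup
print(first["source"]["name"])
# JSON null becomes Python None, so a key can exist and still hold no value
print(first["author"] is None)
```

The second point matters in Step 4: a field such as `author` may be present but empty, so the extraction code has to handle `None` as well as missing keys.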

### Step 4
#### Fetch the articles and write the CSV file
* In this example we are using the client library provided by newsapi to form and execute the **HTTP** request.
* From the JSON object received, we parse the individual elements and populate a list
* The list is then converted to a data frame
* Finally, the data frame is written to the CSV file
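
For reference, the request that the client library forms on our behalf looks roughly like the URL built below. This is a sketch under the assumption that the query goes to the `/v2/everything` endpoint with the parameters `q`, `sources`, `language`, `sortBy` and `apiKey`; `build_everything_url` is a hypothetical helper, and nothing is sent over the network.

```python
from urllib.parse import urlencode

EVERYTHING_ENDPOINT = "https://newsapi.org/v2/everything"


def build_everything_url(query, api_key, sources="the-times-of-india",
                         language="en", sort_by="publishedAt"):
    """Build the GET URL for an 'everything' search (illustrative only)."""
    params = {
        "q": query,
        "sources": sources,
        "language": language,
        "sortBy": sort_by,
        "apiKey": api_key,
    }
    # urlencode escapes spaces and other special characters in the values
    return EVERYTHING_ENDPOINT + "?" + urlencode(params)


url = build_everything_url("business analytics", "your api key")
print(url)
```

Using the client library saves us from assembling and escaping this query string by hand.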

In [4]:
try:
    listData = []
    file_name = 'output_api.csv'
    # Init
    newsapi = NewsApiClient(api_key='your api key') #replace this with your API key

    # /v2/everything: search The Times of India for "business analytics"
    all_articles  = newsapi.get_everything(q='business analytics',
                                          sources='the-times-of-india',
                                          domains='timesofindia.indiatimes.com',
                                          language='en',
                                          sort_by='publishedAt')

    for article in all_articles["articles"]:
        # "source" is a nested object holding the source id and name
        source = article.get("source") or {}
        # .get() covers missing keys; "or ''" also turns JSON nulls into ""
        sourceId = source.get("id") or ""
        sourceName = source.get("name") or ""
        author = article.get("author") or ""
        title = article.get("title") or ""
        description = article.get("description") or ""
        content = article.get("content") or ""

        listData.append((sourceId, sourceName, author, title, description, content))

    cols=['sourceId','sourceName','author','title','description','content']
    articleData = pd.DataFrame(listData, columns=cols)
    # index=False keeps the row index out of the file, so only the six fields are written
    articleData.to_csv(file_name, sep=',', encoding='utf-8', index=False)

except Exception as e:
    print(e)
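
As an alternative to building the list by hand, `pandas.json_normalize` can flatten the nested `source` object directly. The sketch below uses two made-up articles rather than a live response, and the names `sample_articles` and `article_frame` exist only for this example.

```python
import pandas as pd

# Made-up articles in the same shape as the items in all_articles["articles"]
sample_articles = [
    {
        "source": {"id": "the-times-of-india", "name": "The Times of India"},
        "author": "A. Writer",
        "title": "First headline",
        "description": "First description",
        "content": "First content",
    },
    {
        "source": {"id": "the-times-of-india", "name": "The Times of India"},
        "author": None,
        "title": "Second headline",
        "description": "Second description",
        "content": "Second content",
    },
]

# Nested keys become dotted column names such as "source.id"
flat = pd.json_normalize(sample_articles)
flat = flat.rename(columns={"source.id": "sourceId", "source.name": "sourceName"})

cols = ["sourceId", "sourceName", "author", "title", "description", "content"]
# reindex keeps only the wanted columns, in order, and adds any that are absent
article_frame = flat.reindex(columns=cols)
print(article_frame)
```

This trades the explicit loop for two library calls; the loop in Step 4 remains easier to follow when each field needs its own handling.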

### Step 5
#### Validate the file is created with relevant records
* Read the newly created CSV file into a data frame
* Display the first few records from the data frame

In [None]:
data = pd.read_csv('output_api.csv', sep=',', encoding='utf-8') 
data.head()
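
Beyond eyeballing the first rows, a small programmatic check can confirm that the file has the expected columns. The sketch below uses only the standard library `csv` module; `check_csv` and `EXPECTED_COLUMNS` are hypothetical names introduced for this example.

```python
import csv

EXPECTED_COLUMNS = ["sourceId", "sourceName", "author", "title", "description", "content"]


def check_csv(path, expected=EXPECTED_COLUMNS):
    """Return (row_count, missing_columns) for the CSV file at `path`."""
    # newline="" lets the csv module handle line breaks inside quoted fields
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        row_count = sum(1 for _ in reader)
    missing = [name for name in expected if name not in header]
    return row_count, missing
```

After Step 4 has run, `check_csv('output_api.csv')` should return the number of articles saved and an empty list of missing columns. Extra columns, such as a saved row index, are ignored by this check.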