
# Apple Financial Performance

Let's build a very cool ETL process using data from [Financial Modeling Prep](https://financialmodelingprep.com).

Our goal is to fetch data from this website every hour and upload it into a DynamoDB table 😀😀

✌️We'll do this in two steps: 

1. We'll get familiar with the API and create a DynamoDB table
2. We'll create a DAG with Airflow that takes care of uploading the data every hour

1. Import the following libraries:
  * `requests` 
  * `boto3`
  * `datetime` 

In [None]:
import requests
import boto3
import datetime


2. Go to the documentation and find a way to get Apple's stock price as well as its rating
  * Here is the link to the documentation --> [API Documentation](https://financialmodelingprep.com/developer/docs/)

In [None]:
# Replace YOUR_API_KEY with your own key; it is passed through the `apikey` query parameter
company_profile = requests.get('https://financialmodelingprep.com/api/v3/company/profile/AAPL?apikey=YOUR_API_KEY')
company_profile.json()

{'profile': {'beta': '1.139593',
  'ceo': 'Timothy D. Cook',
  'changes': -0.82,
  'changesPercentage': '(-0.27%)',
  'companyName': 'Apple Inc.',
  'description': 'Apple Inc is designs, manufactures and markets mobile communication and media devices and personal computers, and sells a variety of related software, services, accessories, networking solutions and third-party digital content and applications.',
  'exchange': 'Nasdaq Global Select',
  'image': 'https://financialmodelingprep.com/images-New-jpg/AAPL.jpg',
  'industry': 'Computer Hardware',
  'lastDiv': '2.92',
  'mktCap': '1375375678560.00',
  'price': 297.84,
  'range': '142-233.47',
  'sector': 'Technology',
  'volAvg': '36724977',
  'website': 'http://www.apple.com'},
 'symbol': 'AAPL'}

In [None]:
# Replace YOUR_API_KEY with your own key
company_rating = requests.get('https://financialmodelingprep.com/api/v3/company/rating/AAPL?apikey=YOUR_API_KEY')
company_rating.json()

{'rating': {'rating': 'S', 'recommendation': 'Strong Buy', 'score': 5},
 'ratingDetails': {'D/E': {'recommendation': 'Strong Buy', 'score': 4},
  'DCF': {'recommendation': 'Buy', 'score': 4},
  'P/B': {'recommendation': 'Strong Buy', 'score': 5},
  'P/E': {'recommendation': 'Strong Buy', 'score': 5},
  'ROA': {'recommendation': 'Buy', 'score': 5},
  'ROE': {'recommendation': 'Strong Buy', 'score': 5}},
 'symbol': 'AAPL'}

In [None]:
# Replace YOUR_API_KEY with your own key
stock_real_time = requests.get("https://financialmodelingprep.com/api/v3/stock/real-time-price/AAPL?apikey=YOUR_API_KEY")
stock_real_time.json()

{'price': 297.78, 'symbol': 'AAPL'}
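
The three calls above follow the same pattern, so they could be wrapped in a small helper. The sketch below is only one way to do it: `build_url`, `fetch_json` and the injectable `get` argument are hypothetical names introduced for illustration, and the `apikey` query parameter is assumed to be how the key is passed.

```python
BASE_URL = "https://financialmodelingprep.com/api/v3"


def build_url(endpoint, symbol):
    """Join the base URL, an endpoint path and a ticker symbol."""
    return "{}/{}/{}".format(BASE_URL, endpoint.strip("/"), symbol)


def fetch_json(endpoint, symbol, api_key, get=None, timeout=10):
    """Call one endpoint and return the decoded JSON body.

    `get` is injectable so the helper can be exercised without network
    access; by default it falls back to `requests.get`.
    """
    if get is None:
        import requests  # imported lazily so the helper can be tested offline
        get = requests.get
    response = get(build_url(endpoint, symbol), params={"apikey": api_key}, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
    return response.json()
```

With this in place, `fetch_json("company/profile", "AAPL", api_key)["profile"]` would return the same dictionary as the profile cell above.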

In [None]:
company_profile.json()["profile"]

{'beta': '1.139593',
 'ceo': 'Timothy D. Cook',
 'changes': -0.82,
 'changesPercentage': '(-0.27%)',
 'companyName': 'Apple Inc.',
 'description': 'Apple Inc is designs, manufactures and markets mobile communication and media devices and personal computers, and sells a variety of related software, services, accessories, networking solutions and third-party digital content and applications.',
 'exchange': 'Nasdaq Global Select',
 'image': 'https://financialmodelingprep.com/images-New-jpg/AAPL.jpg',
 'industry': 'Computer Hardware',
 'lastDiv': '2.92',
 'mktCap': '1375375678560.00',
 'price': 297.84,
 'range': '142-233.47',
 'sector': 'Technology',
 'volAvg': '36724977',
 'website': 'http://www.apple.com'}

3. Import Pandas and load the data you requested into DataFrames

In [None]:
### READ DATA
import pandas as pd

company_profile_df = pd.DataFrame(company_profile.json()["profile"], index=[0])
company_rating_df = pd.DataFrame(company_rating.json()["rating"], index=[0])

company_profile_df

| | price | beta | volAvg | mktCap | lastDiv | range | changes | changesPercentage | companyName | exchange | industry | website | description | ceo | sector | image |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 297.84 | 1.139593 | 36724977 | 1375375678560.0 | 2.92 | 142-233.47 | -0.82 | (-0.27%) | Apple Inc. | Nasdaq Global Select | Computer Hardware | http://www.apple.com | Apple Inc is designs, manufactures and markets... | Timothy D. Cook | Technology | https://financialmodelingprep.com/images-New-j... |


In [None]:
company_rating_df

| | score | rating | recommendation |
|---|---|---|---|
| 0 | 5 | S | Strong Buy |


In [None]:
company_profile_df.loc[:, ["price", "companyName"]]

| | price | companyName |
|---|---|---|
| 0 | 297.84 | Apple Inc. |


In [None]:
datetime.datetime.now().date().isoformat()

'2020-01-04'

4. Merge both DataFrames and add a `TimeStamp` column holding the current timestamp in ISO format

👋👋 NB: Check [datetime documentation](https://docs.python.org/3/library/datetime.html#datetime.date.isoformat) to get today's date in isoformat
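
To see the difference between a date and a full timestamp in ISO format, here is a small sketch using a fixed moment (the value is chosen only so the output is reproducible):

```python
import datetime

# A fixed moment is used here so the formats are easy to compare
moment = datetime.datetime(2020, 1, 4, 17, 28, 51, 393887)

date_only = moment.date().isoformat()   # '2020-01-04'
date_and_time = moment.isoformat()      # '2020-01-04T17:28:51.393887'

# A timezone-aware timestamp also carries its UTC offset
utc_moment = moment.replace(tzinfo=datetime.timezone.utc).isoformat()
# '2020-01-04T17:28:51.393887+00:00'
```

The full timestamp is the better fit for a table key, since several rows written on the same day would otherwise share the same value.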

In [None]:
### MERGE Both datasets 

current_market_price = pd.concat([company_profile_df.loc[:, ["price", "companyName"]], company_rating_df.loc[:, ["score", "recommendation"]]], axis=1)
current_market_price["TimeStamp"] = datetime.datetime.now().isoformat()
current_market_price

| | price | companyName | score | recommendation | TimeStamp |
|---|---|---|---|---|---|
| 0 | 297.84 | Apple Inc. | 5 | Strong Buy | 2020-01-04T17:28:51.393887 |
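
Since we only ever handle a single row, the same merge could also be sketched with plain dictionaries. This is an alternative to the pandas approach above, not a replacement for it; `build_record` is a hypothetical helper name and its `now` argument exists only to make the result reproducible.

```python
import datetime


def build_record(profile, rating, now=None):
    """Combine the fields we care about from both API payloads into one dict."""
    now = now or datetime.datetime.now()
    return {
        "price": profile["price"],
        "companyName": profile["companyName"],
        "score": rating["score"],
        "recommendation": rating["recommendation"],
        "TimeStamp": now.isoformat(),
    }
```

It would be called as `build_record(company_profile.json()["profile"], company_rating.json()["rating"])`.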


5. Create a session with `boto3` that will store your `aws_access_key_id`, `aws_secret_access_key` and `region_name`

👋👋 NB: [Boto3 Documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/session.html)

In [None]:
### CREATING BOTO3 SESSION
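import os

# Read the credentials from environment variables rather than hardcoding them in the notebook
aws_session = boto3.Session(
     aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
     aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
     region_name="us-east-1"
)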
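
Credentials are best kept out of the notebook itself. As a minimal sketch, the session arguments could be collected from the standard AWS environment variables with an explicit check for missing values; `session_kwargs_from_env` is a hypothetical helper, and the `us-east-1` fallback simply mirrors the region used in this exercise.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def session_kwargs_from_env(env=None, default_region="us-east-1"):
    """Build the keyword arguments for boto3.Session from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "region_name": env.get("AWS_DEFAULT_REGION", default_region),
    }
```

The session would then be created with `boto3.Session(**session_kwargs_from_env())`.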


6. Create a DynamoDB `client` from your boto3 session

In [None]:
### CREATING DYNAMODB CLIENT
dynamodb_client = aws_session.client("dynamodb")

7. With boto3, create a DynamoDB table that we'll call `apple_stock_prices`

In [None]:
### CREATING A DYNAMODB TABLE
dynamodb_client.create_table(
    AttributeDefinitions=[
                          {
                            "AttributeName":"Price",
                            "AttributeType":"N"                             
                          },
                          {
                              "AttributeName":"TimeStamp",
                              "AttributeType":"S"
                          }
    ],
    TableName="apple_stock_prices",
    KeySchema=[
               {
                   "AttributeName":"TimeStamp",
                   "KeyType": "HASH"
               },
               {
                   "AttributeName":"Price",
                   "KeyType": "RANGE"
               }
    ],
    BillingMode="PAY_PER_REQUEST"
)

{'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
   'content-length': '726',
   'content-type': 'application/x-amz-json-1.0',
   'date': 'Sat, 04 Jan 2020 16:56:23 GMT',
   'server': 'Server',
   'x-amz-crc32': '3408761962',
   'x-amzn-requestid': 'H08HMICA3SSVGJNFOBRGHVL4HNVV4KQNSO5AEMVJF66Q9ASUAAJG'},
  'HTTPStatusCode': 200,
  'RequestId': 'H08HMICA3SSVGJNFOBRGHVL4HNVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'RetryAttempts': 0},
 'TableDescription': {'AttributeDefinitions': [{'AttributeName': 'Price',
    'AttributeType': 'N'},
   {'AttributeName': 'TimeStamp', 'AttributeType': 'S'}],
  'BillingModeSummary': {'BillingMode': 'PAY_PER_REQUEST'},
  'CreationDateTime': datetime.datetime(2020, 1, 4, 16, 56, 23, 1000, tzinfo=tzlocal()),
  'ItemCount': 0,
  'KeySchema': [{'AttributeName': 'TimeStamp', 'KeyType': 'HASH'},
   {'AttributeName': 'Price', 'KeyType': 'RANGE'}],
  'ProvisionedThroughput': {'NumberOfDecreasesToday': 0,
   'ReadCapacityUnits': 0,
   'WriteCapacityUnits': 0},
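
Table creation is asynchronous, so the table may not be usable the instant `create_table` returns. A minimal sketch for waiting until it is ready, using the client's `table_exists` waiter (`wait_until_table_exists` is a hypothetical helper name):

```python
def wait_until_table_exists(client, table_name):
    """Block until the table is available, then return its status."""
    waiter = client.get_waiter("table_exists")
    waiter.wait(TableName=table_name)  # polls DescribeTable until the table is ready
    description = client.describe_table(TableName=table_name)
    return description["Table"]["TableStatus"]
```

Calling `wait_until_table_exists(dynamodb_client, "apple_stock_prices")` before inserting the first item avoids writing to a table that is still being created.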
 

8. Try inserting one item to test a way of loading data into DynamoDB

In [None]:
### Insert Data to Dynamodb 
response = dynamodb_client.put_item(
    TableName="apple_stock_prices",
    Item={
        "Price":{
            "N":'{}'.format(current_market_price.price[0])
        },
        "Score":{
            "N": "{}".format(current_market_price.score[0])
        },
        "Recommendation":{
            "S": current_market_price.recommendation[0]
        },
        "TimeStamp":{
            "S": current_market_price.TimeStamp[0]
        }
    }
)
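
The low-level client expects every value wrapped in a type descriptor (`"N"` for numbers sent as strings, `"S"` for strings). As a hedged sketch, that wrapping could be done by a small hand-rolled converter; `to_attribute_value` and `to_dynamodb_item` are hypothetical helpers covering only the types used in this exercise.

```python
from numbers import Number


def to_attribute_value(value):
    """Wrap a Python value in the DynamoDB attribute-value format."""
    if isinstance(value, bool):  # checked first because bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, Number):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    raise TypeError("Unsupported type: {}".format(type(value).__name__))


def to_dynamodb_item(record):
    """Convert a flat dict into the `Item` argument of put_item."""
    return {name: to_attribute_value(value) for name, value in record.items()}
```

For example, `to_dynamodb_item({"Price": 297.84, "Score": 5})` gives `{"Price": {"N": "297.84"}, "Score": {"N": "5"}}`, which can be passed directly as `Item=`.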

9. Finally, create a DAG that performs all the tasks above automatically.
  * Your DAG will need to run every hour
  * You should use `PythonOperator`
  * You might need to use [XComs](https://airflow.apache.org/docs/stable/concepts.html#xcoms) to make your DAG work 
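
To illustrate how XComs connect two `PythonOperator` tasks: the value returned by the upstream callable is pushed to XCom, and the downstream callable pulls it through the task instance (`ti`). The sketch below shows only the two callables; `extract_record`, `load_record` and the placeholder values are assumptions for illustration, and a plain dict is used because it serializes easily.

```python
def extract_record(**context):
    """Upstream task: the return value is pushed to XCom automatically."""
    # Placeholder values standing in for the API call
    return {"price": 297.84, "recommendation": "Strong Buy"}


def load_record(**context):
    """Downstream task: pull what the upstream task returned."""
    ti = context["ti"]  # the running task instance, provided by Airflow
    record = ti.xcom_pull(task_ids="import_data")
    if record is None:
        raise ValueError("No XCom value found for task 'import_data'")
    return record
```

Here `"import_data"` must match the `task_id` of the operator that runs the upstream callable.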

In [None]:
### DAG : 

### STEP 1
from datetime import timedelta
import datetime
import requests
import pandas as pd

import boto3

import airflow
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator

### STEP 2 
# These args will get passed on to each operator
# You can override them on a per-task basis during operator initialization
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
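    # Use a fixed start date (example value): a dynamic now() moves on every parse,
    # so the scheduler never sees a completed interval to run
    'start_date': datetime.datetime(2020, 1, 1),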
    'email': ['admissions@jedha.co'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=0.5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
    # 'wait_for_downstream': False,
    # 'dag': dag,
    # 'adhoc':False,
    # 'sla': timedelta(hours=2),
    # 'execution_timeout': timedelta(seconds=300),
    # 'on_failure_callback': some_function,
    # 'on_success_callback': some_other_function,
    # 'on_retry_callback': another_function,
    # 'trigger_rule': u'all_success'
}

dag = DAG(
    'apple_financial_performance',
    default_args=default_args,
    description="Load Apple's hourly stock price into DynamoDB",
    schedule_interval="@hourly",
    catchup=False,  # only run for the current interval, do not backfill past hours
)


### STEP 2 (continued): define the tasks

#### GET DATA
def get_apple_financial_data():

    ### API CALL
    company_profile = requests.get('https://financialmodelingprep.com/api/v3/company/profile/AAPL')
    company_rating = requests.get('https://financialmodelingprep.com/api/v3/company/rating/AAPL')

    ### READ DATA IN PANDAS
    company_profile_df = pd.DataFrame(company_profile.json()["profile"], index=[0])
    company_rating_df = pd.DataFrame(company_rating.json()["rating"], index=[0])

    ### MERGE Datasets 
    current_market_price = pd.concat([company_profile_df.loc[:, ["price", "companyName"]], company_rating_df.loc[:, ["score", "recommendation"]]], axis=1)
    current_market_price["TimeStamp"] = datetime.datetime.now().isoformat()

    return current_market_price


t1 = PythonOperator(
    task_id="import_data",
    python_callable=get_apple_financial_data,
    dag=dag
)


#### UPLOAD DATA 
def puller(**kwargs):
    ti = kwargs['ti']

    apple_df = ti.xcom_pull(task_ids="import_data")

    # With no explicit keys, boto3 reads credentials from the environment
    # (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) or the shared credentials file
    aws_session = boto3.Session(region_name="us-east-1")

    dynamodb_client = aws_session.client("dynamodb")

    response = dynamodb_client.put_item(
        TableName="apple_stock_prices",
        Item={
            "Price":{
                "N":'{}'.format(apple_df.price[0])
            },
            "Score":{
                "N": "{}".format(apple_df.score[0])
            },
            "Recommendation":{
                "S": apple_df.recommendation[0]
            },
            "TimeStamp":{
                "S": apple_df.TimeStamp[0]
            }
        }
    )
    

t2 = PythonOperator(
    task_id="load_data_into_dynamodb",
    python_callable=puller,
    provide_context=True,
    dag=dag
)


### STEP 3 

t1 >> t2 

Yay! 👏👏 Now you can just wait for your table to fill up with data and eventually make some visualizations 😉😉
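
Once a few rows have accumulated, they can be read back for plotting. A minimal sketch, assuming the table layout used above: `read_all_items` pages through `scan` results with `LastEvaluatedKey`, and `to_rows` unwraps the typed values. Both names are hypothetical helpers.

```python
def read_all_items(client, table_name):
    """Scan the whole table, following pagination until no key is left."""
    items = []
    kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key  # resume where the last page stopped


def to_rows(items):
    """Unwrap DynamoDB typed values into plain (timestamp, price) rows."""
    rows = [
        {"TimeStamp": item["TimeStamp"]["S"], "Price": float(item["Price"]["N"])}
        for item in items
    ]
    return sorted(rows, key=lambda row: row["TimeStamp"])
```

The result of `to_rows(read_all_items(dynamodb_client, "apple_stock_prices"))` can be passed to `pd.DataFrame(...)` for charting.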