# Instructions

1. **Make a copy of this notebook** so you can edit and save your own version of it. Optionally update the notebook title to include the Net IDs of all group members. Do the work in your copy of the notebook. 

2. Edit the sharing settings: **share your notebook** publicly, so **"Anyone with the link can view"**. Notebooks not shared properly may be subject to late deductions if an instructor is unable to access them at the time of grading.

3. **Run the Setup Cells** in the "Setup" section.

4. **Complete the challenges**. For each challenge:
    + **Run any additional provided "Setup" cell(s)**, as necessary.
    + **Write Python code** to answer each of the provided questions. 






# Submission Instructions



To review your notebook / ensure it works as expected / prepare for evaluation: 
  1. Run it from scratch ("Runtime" > "Restart and run all"), provide any necessary user inputs, and verify you see the results you expect.


When you're done coding and your notebook reflects your final work product, follow these steps to submit:

  1. Download a copy of your notebook document in .ipynb format ("File" > "Download" > "Download .ipynb").
  2. Upload the resulting .ipynb notebook file to Canvas.
  3. Also submit your Colab Notebook URL via the [Submission Form](https://forms.gle/gWHLVjN2XVmqCogm7). 


> NOTE: only one member needs to submit on behalf of the group.

# Evaluation

Deliverables will be evaluated based on the criteria below:

Category | Weight
--- | ---
Challenge 1 | 18%
Challenge 2 | 12%
Challenge 3 | 15%
Challenge 4 | 20%
Challenge 5 | 20%
Challenge 6 | 15%

This rubric is tentative, and may be subject to adjustments during the grading process.

# Setup

## Imports



Here are some modules and packages we may be using to complete various aspects of this exercise. Feel free to adjust the alias strategy to meet your preference, and/or import additional modules and packages as desired.

In [90]:
#
# IMPORTS (RUN THIS CELL, AND FEEL FREE TO MODIFY AS DESIRED)
#

import numpy as np

import pandas as pd
#from pandas import read_csv, read_excel, to_datetime, DataFrame

import plotly.express as px
# from plotly.express import line, scatter


## Helper Function (USD Formatting)



The function below can be used to convert a number into a dollar-sign formatted string. 

Run the cell, and feel free to use / invoke this function later as desired, when answering questions in Challenges 4-6.

In [91]:
#
# SETUP CELL (RUN THIS CELL, AND DO NOT MODIFY)
#

def to_usd(my_price):
    """
        Converts a numeric value to USD-formatted string, for printing and display purposes.
        Adds dollar sign and commas for the thousands separator.
        Rounds to two decimal places. 
        
        Param: my_price (int or float or str) like 4000.444444 or "4000.444444"
        
        Example: to_usd(4000.444444)
        
        Returns: $4,000.44
    """
    return f"${float(my_price):,.2f}" 


In [None]:
# example invocations:
print(to_usd(4.5))
print(to_usd(1234567890.12345))

$4.50
$1,234,567,890.12


# Challenges


## Challenge 1 (Fibonacci Sequence)



The **Fibonacci sequence** is a series of numbers in which, after the first two numbers (0 and 1), each successive number is the sum of the prior two numbers. 

...

A) Write / **define a function called `fibo`** that prints the first N numbers of the sequence, where N is an input into the function. For example, if 10 is provided as an input, then the sequence / list of numbers {0, 1, 1, 2, 3, 5, 8, 13, 21, 34} is printed. The function should also return the sequence / list of numbers.

> NOTE: no need to get fancy - our goal is just for the function to produce the desired output.

> HINT: consider looping through a `range` and using conditional logic

B) If a negative number is passed in, the function should gracefully handle the invalid input by printing an error message like "OOPS PLEASE PASS A POSITIVE INTEGER", and not attempting to continue processing that input. 

C) Add a docstring to your function definition. The docstring should describe the function's purpose, as well as provide information about the parameter and return values. Optionally also add type hints to your function definition, if you like that kind of thing.
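
One way to cover parts A through C together is sketched below. The name `fibo_sketch` is hypothetical (chosen so it does not clash with your own `fibo`), and returning an empty list for invalid input is an assumption, since the prompt only asks for a printed error message.

```python
def fibo_sketch(n: int) -> list:
    """
    Prints and returns the first n numbers of the Fibonacci sequence.

    Param: n (int) how many numbers to generate, like 10

    Returns: a list like [0, 1, 1, 2, 3, 5, 8, 13, 21, 34].
        For invalid input, prints an error message and returns an
        empty list (the empty list is an assumption of this sketch).
    """
    if not isinstance(n, int) or n <= 0:
        print("OOPS PLEASE PASS A POSITIVE INTEGER")
        return []

    sequence = []
    for i in range(n):
        if i < 2:
            sequence.append(i)  # the first two numbers are 0 and 1
        else:
            sequence.append(sequence[-1] + sequence[-2])

    print(sequence)
    return sequence


fibo_sketch(10)  # prints [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
fibo_sketch(-5)  # prints OOPS PLEASE PASS A POSITIVE INTEGER
```

Checking the input before building the list keeps the invalid case from ever reaching the loop.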



In [11]:

# DEFINE THE FUNCTION HERE

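def fibo(N):
  """
  Returns the first N numbers of the Fibonacci sequence as a list.

  Param: N (int) how many numbers to generate, like 10

  Returns: a list like [0, 1, 1, 2, 3, 5, 8, 13, 21, 34],
    or a list containing an error message if N is not positive.
  """
  fib = [0, 1]
  if N <= 0:
    return ["OOPS PLEASE PASS A POSITIVE INTEGER"]
  if N == 1:
    return [0]
  if N == 2:
    return fib
  for x in range(2, N):
    fib.append(fib[x-1] + fib[x-2])
  return fib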
  
# Start with a list, fib, holding the first two numbers of the Fibonacci sequence
# If N is less than or equal to 0, the function returns "OOPS PLEASE PASS A POSITIVE INTEGER" inside a list
# If N is 1, return only the first number of the sequence
# If N is 2, return the fib list of 0 and 1
# Otherwise, loop from 2 up to (but not including) N
# On each pass, append the sum of the previous two numbers to the list
# Finally, return the list
# For example, with N = 3 the loop runs once: it sums the numbers in the two prior positions (0 and 1)
# and appends the result (1) as the third number in the list







In [8]:
fibo(2)

[0, 1]

In [10]:
fibo(1)

[0]

In [12]:
fibo(-15)

['OOPS PLEASE PASS A POSITIVE INTEGER']

## Challenge 2 (Simulations)


Use capabilities of the `numpy` package to answer all the following questions:


A) Write a line of code that prints all odd numbers between 0 and 100. Do not use a loop to do so.

B) Generate 10,000 random numbers that conform to a "normal" distribution, where the mean is 0 and the standard deviation is 0.15.

C) Using the normally distributed numbers from Part B, print the following summary statistics:
  + mean 
  + median
  + minimum
  + maximum
  + 25th percentile / quantile value 
  + 75th percentile / quantile value
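
The three parts above can be sketched in a single cell, as shown below. This version uses NumPy's `default_rng` generator with a fixed seed so the numbers are reproducible; the seed value and the variable names are assumptions, not requirements of the challenge.

```python
import numpy as np

# Part A: odd numbers between 0 and 100, built without a loop
odds = np.arange(1, 100, 2)
print(odds)

# Part B: 10,000 draws from a normal distribution (mean 0, std 0.15)
rng = np.random.default_rng(seed=42)  # the seed value is an arbitrary assumption
samples = rng.normal(loc=0, scale=0.15, size=10_000)

# Part C: summary statistics, each printed with a label
summary = {
    "mean": np.mean(samples),
    "median": np.median(samples),
    "min": np.min(samples),
    "max": np.max(samples),
    "25th percentile": np.quantile(samples, 0.25),
    "75th percentile": np.quantile(samples, 0.75),
}
for label, value in summary.items():
    print(f"{label}: {value:.4f}")
```

Labeling each statistic makes the output readable on its own, without checking the order of the `print` calls.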


In [17]:
# https://numpy.org/doc/stable/reference/generated/numpy.arange.html
# https://stackoverflow.com/questions/41638751/filtering-even-numbers-in-python

print(np.arange(1 , 100 , 2))

[ 1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47
 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95
 97 99]


In [24]:
# https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html

a = np.random.normal(0 , 0.15 , 10000)
print(a)

[ 0.01787104 -0.05180772  0.12019143 ...  0.03902514  0.15723893
 -0.09421446]


In [31]:
# https://numpy.org/doc/stable/reference/generated/numpy.median.html
# https://numpy.org/doc/stable/reference/generated/numpy.quantile.html

print(np.median(a))
print(np.mean(a))
print(np.min(a))
print(np.max(a))
print(np.quantile(a , .25))
print(np.quantile(a , .75))

0.0015896457344820505
0.0009757943813672286
-0.5661565574654915
0.5379272378506524
-0.10290337429912177
0.10375075458723833


## Challenge 3 (SEC Filings)


Given the Python variable called `filings_index` provided below, write Python code which references that variable to perform the following tasks...


A) Convert this data into a **more usable data structure**: a list of dictionaries, where each dictionary has the following keys: `"cik"`, `"company_name"`, `"form_type"`, `"date_filed"`, and `"filename"`. 


> HINT: try the string's [`split()`](https://docs.python.org/3/library/stdtypes.html#str.split) or `splitlines()` method to convert the string to a list of lines

> HINT: new line characters are represented by `"\n"`.

B) Print the **number of filings** total (i.e. `21`).

C) Print the **number of filings** filed by "AMAZON COM INC" (i.e. `4`).

D) Print the **file name** of the "10-K" form filed by "AMAZON COM INC" (i.e. `"edgar/data/1018724/0001018724-19-000004.txt"`).


E) Use string concatenation or a format string to assemble the full URL of this filing, by joining the file name from part D to the end of the provided `ARCHIVES_URL` variable (i.e. `"https://www.sec.gov/Archives/edgar/data/1018724/0001018724-19-000004.txt"`). Store this **filing url** in a variable called `filing_url` and print it. 

> NOTE: you should be able to visit this url in a browser to verify there is a filing there.
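
As a rough illustration of parts A, D, and E, the sketch below parses a two-row excerpt copied from the index in the setup cell. The names `sample_index`, `KEYS`, and `ten_k` are hypothetical, and locating the dashed separator line is just one possible way to skip the header.

```python
ARCHIVES_URL = "https://www.sec.gov/Archives"

# A two-row excerpt of the index, copied from the setup cell
sample_index = """CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1018724|AMAZON COM INC|10-K|2019-02-01|edgar/data/1018724/0001018724-19-000004.txt
1018724|AMAZON COM INC|8-K|2019-01-31|edgar/data/1018724/0001018724-19-000002.txt"""

KEYS = ["cik", "company_name", "form_type", "date_filed", "filename"]

lines = sample_index.splitlines()

# Find the dashed separator rather than hard-coding how many lines to skip
separator_position = next(i for i, line in enumerate(lines) if line.startswith("---"))
rows = lines[separator_position + 1:]

# zip pairs each key with the matching pipe-delimited value
filings = [dict(zip(KEYS, row.split("|"))) for row in rows]
print(len(filings))

# Pick out the 10-K filing and assemble its full URL
ten_k = next(f for f in filings if f["form_type"] == "10-K")
filing_url = f"{ARCHIVES_URL}/{ten_k['filename']}"
print(filing_url)
```

Searching for the separator means the parsing still works if the number of metadata lines above the table changes.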



In [68]:
# SETUP CELL (RUN AND DO NOT MODIFY)

ARCHIVES_URL = "https://www.sec.gov/Archives"

# triple quotes is a multi-line string:
filings_index = """
Description:           Master Index of EDGAR Dissemination Feed
Last Data Received:    March 31, 2019
Comments:              webmaster@sec.gov
Anonymous FTP:         ftp://ftp.sec.gov/edgar/
Cloud HTTP:            https://www.sec.gov/Archives/

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1000045|NICHOLAS FINANCIAL INC|10-Q|2019-02-14|edgar/data/1000045/0001193125-19-039489.txt
1000045|NICHOLAS FINANCIAL INC|4|2019-01-15|edgar/data/1000045/0001357521-19-000001.txt
1000045|NICHOLAS FINANCIAL INC|4|2019-02-19|edgar/data/1000045/0001357521-19-000002.txt
1000045|NICHOLAS FINANCIAL INC|4|2019-03-15|edgar/data/1000045/0001357521-19-000003.txt
1000045|NICHOLAS FINANCIAL INC|8-K|2019-02-01|edgar/data/1000045/0001193125-19-024617.txt
1000148|LEGACY CAPITAL FUND, INC.|X-17A-5|2019-02-21|edgar/data/1000148/9999999997-19-000997.txt
1000177|NORDIC AMERICAN TANKERS Ltd|424B3|2019-03-29|edgar/data/1000177/0000919574-19-002650.txt
1000177|NORDIC AMERICAN TANKERS Ltd|6-K|2019-02-12|edgar/data/1000177/0000919574-19-000990.txt
1018724|AMAZON COM INC|10-K|2019-02-01|edgar/data/1018724/0001018724-19-000004.txt
1018724|AMAZON COM INC|8-K|2019-01-31|edgar/data/1018724/0001018724-19-000002.txt
1018724|AMAZON COM INC|8-K|2019-02-04|edgar/data/1018724/0001018724-19-000008.txt
1018724|AMAZON COM INC|8-K|2019-02-25|edgar/data/1018724/0001193125-19-049914.txt
101872|FIRST VARIABLE ANNUITY FUND A|24F-2NT|2019-03-25|edgar/data/101872/0000101872-19-000003.txt
101872|FIRST VARIABLE ANNUITY FUND A|N-CEN|2019-03-11|edgar/data/101872/0001713935-19-000052.txt
1018735|NYMOX PHARMACEUTICAL CORP|20-F|2019-03-29|edgar/data/1018735/0001640334-19-000464.txt
1018735|NYMOX PHARMACEUTICAL CORP|4|2019-01-03|edgar/data/1018735/0001682873-19-000001.txt
1018735|NYMOX PHARMACEUTICAL CORP|4|2019-01-03|edgar/data/1018735/0001682873-19-000002.txt
1018825|IBM Retirement Fund|13F-HR|2019-02-14|edgar/data/1018825/0001018825-19-000001.txt
1018840|ABERCROMBIE & FITCH CO /DE/|3|2019-02-06|edgar/data/1018840/0001225208-19-001840.txt
1018840|ABERCROMBIE & FITCH CO /DE/|3|2019-02-06|edgar/data/1018840/0001225208-19-001841.txt
1018840|ABERCROMBIE & FITCH CO /DE/|4|2019-02-06|edgar/data/1018840/0001225208-19-001842.txt"""

print(filings_index)


Description:           Master Index of EDGAR Dissemination Feed
Last Data Received:    March 31, 2019
Comments:              webmaster@sec.gov
Anonymous FTP:         ftp://ftp.sec.gov/edgar/
Cloud HTTP:            https://www.sec.gov/Archives/

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1000045|NICHOLAS FINANCIAL INC|10-Q|2019-02-14|edgar/data/1000045/0001193125-19-039489.txt
1000045|NICHOLAS FINANCIAL INC|4|2019-01-15|edgar/data/1000045/0001357521-19-000001.txt
1000045|NICHOLAS FINANCIAL INC|4|2019-02-19|edgar/data/1000045/0001357521-19-000002.txt
1000045|NICHOLAS FINANCIAL INC|4|2019-03-15|edgar/data/1000045/0001357521-19-000003.txt
1000045|NICHOLAS FINANCIAL INC|8-K|2019-02-01|edgar/data/1000045/0001193125-19-024617.txt
1000148|LEGACY CAPITAL FUND, INC.|X-17A-5|2019-02-21|edgar/data/1000148/9999999997-19-000997.txt
1000177|NORDIC AMERICAN TANKERS Ltd|424B3|2019-03-29|edgar/data/1000177/0000919574-1

In [65]:
print(type(filings_index))

<class 'str'>


In [72]:
print("------------------")
print("PROCESSING SEC FILINGS...")
print("------------------")


# todo: write some python here

# strip() removes leading and trailing whitespace from the index string, and splitlines() divides it into a list of lines
# [8:] skips the first 8 lines (metadata, blank line, column headers, and separator), since they are not filings
filings_list = []
filings_string = filings_index.strip().splitlines()[8:]

# Next, split each filing line on the "|" character, which breaks it into a list of 5 values
# Each value is then assigned to a key: the first value is "cik", and so on, until the dictionary is complete
# Then the dictionary is appended to the filings list
for filing in filings_string:
  filing_data = filing.split("|")
  filing_dict = {
    "cik": filing_data[0],
    "company_name": filing_data[1],
    "form_type": filing_data[2],
    "date_filed": filing_data[3],
    "filename": filing_data[4]
  }
  filings_list.append(filing_dict)




------------------
PROCESSING SEC FILINGS...
------------------


In [73]:
filings_count = len(filings_list)  # get the number of elements in the list
print(filings_count)

21


In [74]:
# keep only the filings whose company name is exactly "AMAZON COM INC"
amazon_filings = [filing for filing in filings_list if filing["company_name"] == "AMAZON COM INC"]
# count the number of Amazon filings found
amazon_filings_count = len(amazon_filings)
print(amazon_filings_count)

4


In [75]:
for filing in amazon_filings:
  if filing["form_type"] == "10-K":
    amazon_10k_filing = filing["filename"]  # get the filename of the 10-K filing
    # break exits the loop as soon as the 10-K filing is found
    break
print(amazon_10k_filing)

edgar/data/1018724/0001018724-19-000004.txt


In [76]:
filing_url = ARCHIVES_URL + "/" + amazon_10k_filing  # concatenate the URL and the filename
print(filing_url)

https://www.sec.gov/Archives/edgar/data/1018724/0001018724-19-000004.txt


In [77]:
print(filings_list)


[{'cik': '1000045', 'company_name': 'NICHOLAS FINANCIAL INC', 'form_type': '10-Q', 'date_filed': '2019-02-14', 'filename': 'edgar/data/1000045/0001193125-19-039489.txt'}, {'cik': '1000045', 'company_name': 'NICHOLAS FINANCIAL INC', 'form_type': '4', 'date_filed': '2019-01-15', 'filename': 'edgar/data/1000045/0001357521-19-000001.txt'}, {'cik': '1000045', 'company_name': 'NICHOLAS FINANCIAL INC', 'form_type': '4', 'date_filed': '2019-02-19', 'filename': 'edgar/data/1000045/0001357521-19-000002.txt'}, {'cik': '1000045', 'company_name': 'NICHOLAS FINANCIAL INC', 'form_type': '4', 'date_filed': '2019-03-15', 'filename': 'edgar/data/1000045/0001357521-19-000003.txt'}, {'cik': '1000045', 'company_name': 'NICHOLAS FINANCIAL INC', 'form_type': '8-K', 'date_filed': '2019-02-01', 'filename': 'edgar/data/1000045/0001193125-19-024617.txt'}, {'cik': '1000148', 'company_name': 'LEGACY CAPITAL FUND, INC.', 'form_type': 'X-17A-5', 'date_filed': '2019-02-21', 'filename': 'edgar/data/1000148/9999999997-

## Challenge 4 (Stock Prices Dictionary)



Given the Python variable called `stock_data` provided below, write Python code which references that variable to perform each of the following tasks...

A) Print the **selected stock symbol** (i.e. `"MSFT"`). 


B) Assuming the latest day will always be listed first, print the **latest available day** (i.e. `"2030-03-16"`). 



C) Assuming the latest day will always be listed first, print the stock's **latest closing price**, formatted as USD with dollar sign and two decimal places (i.e. `"$237.71"`) 

> HINT: use the `to_usd()` function provided via setup cell


D) Assemble and print a new **list of closing prices**, with the prices listed in ASCENDING order of their date (e.g. `[232.42, 237.13, 235.75, 234.81, 237.71]` ). 


> HINT: in your sorting operation, try leveraging the `operator` module's `itemgetter` function, or your own custom function
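
The `itemgetter` approach mentioned in the hint could look something like the sketch below. It uses three days copied from the setup cell and trimmed to the closing price; `time_series` and `days_ascending` are hypothetical names.

```python
from operator import itemgetter

# Three days copied from the setup cell, trimmed to the closing price only
time_series = {
    "2030-03-16": {"4. close": "237.7100"},
    "2030-03-15": {"4. close": "234.8100"},
    "2030-03-10": {"4. close": "232.4200"},
}

# .items() yields (date, prices) pairs; itemgetter(0) sorts them by the date string
days_ascending = sorted(time_series.items(), key=itemgetter(0))

closing_prices = [float(prices["4. close"]) for _, prices in days_ascending]
print(closing_prices)  # [232.42, 234.81, 237.71]
```

Sorting the date strings directly works here because dates in `YYYY-MM-DD` format sort alphabetically in the same order as chronologically.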



In [97]:
# SETUP CELL (RUN AND DO NOT MODIFY)

stock_data = {
    "Meta Data": {
        "1. Information": "Daily Prices (open, high, low, close) and Volumes",
        "2. Symbol": "MSFT",
        "3. Output Size": "Compact",
        "4. Time Zone": "US/Eastern"
    },
    "Time Series (Daily)": {
        "2030-03-16": {
            "1. open": "236.2800",
            "2. high": "240.0550",
            "3. low": "235.9400",
            "4. close": "237.7100",
            "5. volume": "28092196"
        },
        "2030-03-15": {
            "1. open": "234.9600",
            "2. high": "235.1850",
            "3. low": "231.8100",
            "4. close": "234.8100",
            "5. volume": "26042669"
        },
        "2030-03-12": {
            "1. open": "234.0100",
            "2. high": "235.8200",
            "3. low": "233.2300",
            "4. close": "235.7500",
            "5. volume": "22653662"
        },
        "2030-03-11": {
            "1. open": "234.9600",
            "2. high": "239.1700",
            "3. low": "234.3100",
            "4. close": "237.1300",
            "5. volume": "29907586"
        },
        "2030-03-10": {
            "1. open": "237.0000",
            "2. high": "237.0000",
            "3. low": "232.0400",
            "4. close": "232.4200",
            "5. volume": "29746812"
        }
    }
}



In [82]:
print("------------------")
print("PROCESSING STOCK DATA...")
print("------------------")

# todo: write some python here

MSFT_stock_symbol = stock_data["Meta Data"]["2. Symbol"]
print(MSFT_stock_symbol)


------------------
PROCESSING STOCK DATA...
------------------
MSFT


In [93]:
last_day = list(stock_data["Time Series (Daily)"].keys())[0]
print(last_day)

2030-03-16


In [99]:
latest_close = stock_data["Time Series (Daily)"][last_day]["4. close"]
formatted_latest_close = to_usd(float(latest_close))
print(formatted_latest_close)

$237.71


In [103]:
dates = list(stock_data["Time Series (Daily)"].keys())

# sort dates in ascending order
dates.sort()

# create empty list to store closing prices
closing_prices = []

# iterate through dates and get corresponding closing price
for date in dates:
  closing_price = float(stock_data["Time Series (Daily)"][date]["4. close"])
  closing_prices.append(closing_price)

# print sorted closing prices
print(closing_prices)

[232.42, 237.13, 235.75, 234.81, 237.71]


## Challenge 5 (Stocks Prices CSV File)

After running the provided setup cells below, use the provided `nflx_df` variable to answer the following questions:

A) Assuming each row represents a different date, print the **number of rows** or dates available (i.e. `100`).

B) Print the **column names** (i.e. `['timestamp', 'open', 'high', 'low', 'close', 'adjusted_close', 'volume', 'dividend_amount', 'split_coefficient']`).

C) Print the **earliest date** of available data (i.e. `"2021-05-27"`). We can assume it will be provided in the last row. 

D) Print the **latest date** of available data (i.e. `"2021-10-18"`). We assume it will be provided in the first row.

E) Print the **latest adjusted closing price**, formatted as USD (i.e. `"$637.97"`).

<hr>

> DATA NOTE: only the adjusted close column is scaled properly to account for stock splits. In this case there are no stock splits (as denoted by all ones in the "split_coefficient" column), so it is methodologically sound to calculate the answers for the next two questions using the "high" and "low" columns respectively.


F) Print the **highest high price**, formatted as USD (i.e. "$646.84"). This is the maximum of all high prices.

G) Print the **lowest low price**, formatted as USD (i.e. "$482.14"). This is the minimum of all low prices.
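
The questions above might be approached as in the sketch below, which uses a small stand-in DataFrame built from the first three rows shown in the setup cell output, so its answers differ from the full dataset. `sample_df` and `latest_row` are hypothetical names, and `to_usd` is repeated here only to keep the example self-contained.

```python
import pandas as pd

# Stand-in data: the first three rows shown in the setup cell output
sample_df = pd.DataFrame({
    "timestamp": ["2021-10-18", "2021-10-15", "2021-10-14"],
    "high": [638.41, 639.42, 636.88],
    "low": [620.5901, 625.16, 626.79],
    "adjusted_close": [637.97, 628.29, 633.8],
})

def to_usd(my_price):
    # same formatting as the helper defined in the Setup section
    return f"${float(my_price):,.2f}"

print(len(sample_df))                # number of rows
print(sample_df.columns.tolist())    # column names as a plain list
print(sample_df["timestamp"].min())  # earliest date
print(sample_df["timestamp"].max())  # latest date

# Sort by date so the answer does not depend on the row order of the file
latest_row = sample_df.sort_values(by="timestamp", ascending=False).iloc[0]
print(to_usd(latest_row["adjusted_close"]))  # latest adjusted close

print(to_usd(sample_df["high"].max()))  # highest high
print(to_usd(sample_df["low"].min()))   # lowest low
```

Calling `.tolist()` on the columns prints a plain list, which matches the format shown in part B.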


In [104]:

#
# SETUP CELL (RUN AND DO NOT MODIFY)
#

nflx_file_url = "https://raw.githubusercontent.com/prof-rossetti/intro-to-python/main/data/daily_adjusted_nflx.csv"
print("RAW CSV FILE URL:", nflx_file_url)

nflx_df = pd.read_csv(nflx_file_url)
nflx_df.head()


RAW CSV FILE URL: https://raw.githubusercontent.com/prof-rossetti/intro-to-python/main/data/daily_adjusted_nflx.csv


Unnamed: 0,timestamp,open,high,low,close,adjusted_close,volume,dividend_amount,split_coefficient
0,2021-10-18,632.1,638.41,620.5901,637.97,637.97,4669071,0.0,1.0
1,2021-10-15,638.0,639.42,625.16,628.29,628.29,4116874,0.0,1.0
2,2021-10-14,632.23,636.88,626.79,633.8,633.8,2672535,0.0,1.0
3,2021-10-13,632.1791,632.1791,622.1,629.76,629.76,2424638,0.0,1.0
4,2021-10-12,633.02,637.655,621.99,624.94,624.94,3227349,0.0,1.0


In [105]:
# FYI all split coefs are equal to 1: 
print("SPLITS:", nflx_df["split_coefficient"].unique())

SPLITS: [1.]


In [106]:

fig = px.line(nflx_df, x="timestamp", y="adjusted_close", title="Stock Prices (NFLX)")
fig.show()

In [None]:

# TODO: write some code here to answer the questions


In [107]:
num_dates = nflx_df.shape[0]
print(num_dates)

100


In [108]:
column_names = nflx_df.columns
print(column_names)

Index(['timestamp', 'open', 'high', 'low', 'close', 'adjusted_close', 'volume',
       'dividend_amount', 'split_coefficient'],
      dtype='object')


In [115]:
earliest_date_df = nflx_df.sort_values(by='timestamp').iloc[0]
earliest_date = earliest_date_df['timestamp']
print(earliest_date)

2021-05-27


In [110]:
latest_date_df = nflx_df.sort_values(by='timestamp', ascending=False).iloc[0]
latest_date = latest_date_df['timestamp']
print(latest_date)

2021-10-18


In [111]:
latest_adj_close_df = nflx_df.sort_values(by='timestamp', ascending=False).iloc[0]
latest_adj_close = latest_adj_close_df['adjusted_close']
formatted_latest_adj_close = "${:,.2f}".format(latest_adj_close)
print(formatted_latest_adj_close)

$637.97


In [112]:
highest_high = nflx_df['high'].max()
formatted_highest_high = "${:,.2f}".format(highest_high)
print(formatted_highest_high)

$646.84


In [113]:
lowest_low = nflx_df['low'].min()
formatted_lowest_low = "${:,.2f}".format(lowest_low)
print(formatted_lowest_low)

$482.14




## Challenge 6 (Stock Prices Real-time)

After running the provided setup cells below, use the provided `realtime_df` variable to answer the following questions:

A) Assuming each row represents a different date, print the **number of rows** or dates available (e.g. `4625`, but will likely differ depending on when you run the code).

B) Print the **column names** (i.e. `['timestamp', 'open', 'high', 'low', 'close', 'adjusted_close', 'volume', 'dividend_amount', 'split_coefficient']`).

C) Print the **earliest date** of available data (i.e. `"2004-08-19"`). We can assume it will be provided in the last row. 

D) Print the **latest date** of available data (e.g. `"2022-12-30"`, but will likely differ depending on when you run the code). We assume it will be provided in the first row.

E) Print the **latest adjusted closing price**, formatted as USD (e.g. `"$88.23"`, but will likely differ depending on when you run the code).

<hr>

> DATA NOTE: only the adjusted close column is scaled properly to account for stock splits, so taking the highest high or lowest low would not be methodologically sound unless we first scale the highs and lows based on the split coefficient. No need to worry about this for now. We may return to this task during the next unit.

<hr>

F) **Save / export** this `realtime_df` to a CSV file. The filename should include the stock symbol in lowercase (i.e. `"latest_prices_googl.csv"`). Specifically use a format string or string concatenation to compile the filename using the provided `symbol` variable. 

> NOTE: after saving, you should be able to view the file using the Colab filesystem menu in the left sidebar. You could then optionally download the file and inspect in spreadsheet software, as desired.
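
Part F could be approached as in the sketch below. It uses a two-row stand-in for `realtime_df` (rows copied from the setup cell output) and writes to a temporary folder so the example cleans up after itself; `sample_df`, `csv_filename`, and `csv_filepath` are hypothetical names. Leaving out the index column with `index=False` is a choice of this sketch, not a stated requirement.

```python
import os
import tempfile

import pandas as pd

symbol = "GOOGL"

# Stand-in for realtime_df: two rows copied from the setup cell output
sample_df = pd.DataFrame({
    "timestamp": ["2023-01-06", "2023-01-05"],
    "adjusted_close": [87.34, 86.2],
})

# Build the filename from the symbol, lowercased
csv_filename = f"latest_prices_{symbol.lower()}.csv"
print(csv_filename)

# A temporary folder keeps this sketch from leaving files behind;
# in Colab, passing csv_filename alone writes to the working directory
with tempfile.TemporaryDirectory() as folder:
    csv_filepath = os.path.join(folder, csv_filename)
    sample_df.to_csv(csv_filepath, index=False)

    # Read the file back to confirm the export worked
    round_trip = pd.read_csv(csv_filepath)

print(round_trip.shape)
```

In the notebook itself, `realtime_df.to_csv(csv_filename, index=False)` would be the equivalent call, after which the file should appear in the Colab filesystem menu.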

In [131]:
#
# SETUP CELL (RUN AND DO NOT MODIFY)
#

from getpass import getpass

# copy one of the professor's premium keys from Canvas, 
# run this cell and paste it when prompted
# avoid printing or otherwise exposing this secret credential, 
# as that will incur a major security deduction!
API_KEY = getpass("Please input your Alphavantage API KEY: ") 

Please input your Alphavantage API KEY: ··········


In [132]:
#
# SETUP CELL (RUN AND DO NOT MODIFY)
#

symbol = "GOOGL"

# see API Docs for daily adjusted stock data: https://www.alphavantage.co/documentation/#dailyadj
# notes: 
# ... we are using the "format=csv" URL parameter to request CSV formatted data
# ... we are using the "outputsize=full" URL parameter to request all-time data
# ... we are passing in a "symbol" parameter as well, which in this case happens to be "GOOGL", but theoretically could be any stock of interest
request_url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol={symbol}&apikey={API_KEY}&datatype=csv&outputsize=full"
# avoid seeing a printed version of the url in your final deliverable, 
# as that would expose the secret credential API Key value

realtime_df = pd.read_csv(request_url)
realtime_df.head()

Unnamed: 0,timestamp,open,high,low,close,adjusted_close,volume,dividend_amount,split_coefficient
0,2023-01-06,86.79,87.69,84.86,87.34,87.34,41381495,0.0,1.0
1,2023-01-05,87.47,87.57,85.9,86.2,86.2,27194375,0.0,1.0
2,2023-01-04,90.35,90.65,87.271,88.08,88.08,34854776,0.0,1.0
3,2023-01-03,89.585,91.05,88.52,89.12,89.12,28131224,0.0,1.0
4,2022-12-30,86.98,88.3,86.57,88.23,88.23,23986297,0.0,1.0


In [133]:

fig = px.line(realtime_df, x="timestamp", y="adjusted_close", 
           title=f"Daily Adjusted Closing Prices ({symbol})",
           labels={"timestamp": "Date", "adjusted_close": "Adjusted Closing Price (USD)"},
)
fig.show()

In [None]:

# TODO: write some code here to answer the questions



In [134]:
print(f"Number of rows: {realtime_df.shape[0]}")

Number of rows: 4629


In [128]:
print(f"Column names: {realtime_df.columns.tolist()}")

Column names: ['timestamp', 'open', 'high', 'low', 'close', 'adjusted_close', 'volume', 'dividend_amount', 'split_coefficient']


In [None]:
# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html


In [124]:
earliest_date = realtime_df.tail(1)['timestamp'].iloc[0]
print(f"Earliest date: {earliest_date}")

Earliest date: 2004-08-19


In [129]:
latest_date = realtime_df.head(1)['timestamp'].iloc[0]
print(f"Latest date: {latest_date}")

Latest date: 2023-01-06


In [130]:
latest_price = realtime_df.head(1)['adjusted_close'].iloc[0]
formatted_price = "${:,.2f}".format(latest_price)
print(f"Latest adjusted closing price: {formatted_price}")

Latest adjusted closing price: $87.34
