
# Module CSV

### Reading and Writing CSV Files

Text files work fine when we are referencing small amounts of information, but when we use larger amounts of data, adding structure helps in organizing and retrieving values.

One common format found in business and social sciences alike (as well as any field concerned with data science) is the comma-separated values (CSV) format.

**CSV files** are the most common format used for importing and exporting data from spreadsheets and databases.

CSV files are text files that have delimiters.  A **delimiter** is a character that separates data values.

You can explore CSV files in spreadsheet software (such as Microsoft Excel), which will remove delimiters (usually commas) and store data values in separate cells.


One of the benefits of importing data files such as CSV files is the ability to read in a lot of data at once, parsing the data so your code can access individual values within the data. **By default, CSV files use commas (“,”) to separate data values**.
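
To make the role of the delimiter concrete, here is a minimal sketch that parses a few lines of CSV text held in a string rather than in a file. The sample values and the variable names (`sample_text`, `semicolon_text`) are made up for illustration; only `csv.reader` and `io.StringIO` come from the standard library.

```python
import csv
import io

# Hypothetical sample data: three columns separated by commas.
sample_text = "Date,Open,Close\n2022-05-20,30.10,30.45\n2022-05-23,30.50,30.20\n"

# io.StringIO lets csv.reader treat the string like an open text file.
rows = list(csv.reader(io.StringIO(sample_text)))

header = rows[0]        # column names
first_record = rows[1]  # every value is still a string at this point

print(header)
print(first_record)

# The same data with a semicolon delimiter needs the delimiter argument.
semicolon_text = sample_text.replace(",", ";")
semicolon_rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))
```

Note that the reader returns every value as a string, so numeric columns still need a conversion such as `float()` before any arithmetic.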



## Working with a large CSV file

In [None]:
!curl "https://query1.finance.yahoo.com/v7/finance/download/NAB.AX?period1=1653004800&period2=1661040000&interval=1d&events=history&includeAdjustedClose=true" > NABData.csv



In [None]:
import csv

# open the file and print its first three rows (the header plus two records)
with open("NABData.csv", 'r')  as data:
  reader = csv.reader(data)
  print(next(reader))
  print(next(reader))
  print(next(reader))


In [None]:
import csv

# initialize two empty lists, date and close, where the extracted data will be stored later
date = []
close = []

# open the file and read its records
with open("NABData.csv", 'r') as data: # opens the file named "NABData.csv" in read mode ('r')

  # creates a CSV reader object by passing the file object data to csv.reader().
  # This reader lets us iterate through the rows of the CSV file.
  reader = csv.reader(data)
  # next() returns the first row and advances the reader, so the loop below
  # starts at the first data row instead of the header row of column names.
  header = next(reader, None)
  #print(header)
  for record in reader:
    date.append(record[0]) # first column: the date, kept as a string
    close.append(float(record[5])) # sixth column, converted from a string to a float

# plot the results
from matplotlib import pyplot as plt
f = plt.figure()
f.set_figwidth(20)
f.set_figheight(5)
plt.xticks(rotation=45)
plt.plot(date, close)
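
The loop above selects columns by position (`record[0]`, `record[5]`). As a hedged alternative, `csv.DictReader` from the same module uses the header row as keys, so values can be looked up by column name. The sketch below runs on a small in-memory sample; the column layout and the prices are assumptions made for illustration, not rows copied from the downloaded file.

```python
import csv
import io

# Hypothetical rows in an assumed column layout (not taken from NABData.csv).
sample_text = (
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2022-05-20,30.10,30.60,30.00,30.45,29.80,5000000\n"
    "2022-05-23,30.50,30.70,30.10,30.20,29.55,4200000\n"
)

date = []
adj_close = []

# DictReader consumes the header row itself, so no manual next() call is needed.
for record in csv.DictReader(io.StringIO(sample_text)):
    date.append(record["Date"])
    adj_close.append(float(record["Adj Close"]))

print(date)
print(adj_close)
```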


# Module Pandas

## Using Pandas Module to Read in CSV Files (the “Easy Way”)

An easier method of importing files into Python is using the **Pandas** module. Pandas (name derived from “panel data”) is a data analysis library that, among other things, makes reading in CSV files and accessing the contained data much easier.

The primary data structure used in Pandas is the ***DataFrame***. A DataFrame has a *two-dimensional tabular format using rows and columns*. Using a DataFrame, we can reference columns by name, rather than having to count to figure out which column number we want. Pandas also provides an assortment of methods, such as .mean(), that compute summary statistics on our data.
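
As a small illustration of referencing columns by name and calling a summary method, the sketch below builds a DataFrame by hand instead of reading a file. The dates and prices are hypothetical values chosen so the result is easy to check.

```python
import pandas as pd

# Hypothetical prices, typed in by hand so the example needs no download.
df = pd.DataFrame(
    {
        "Date": ["2022-05-20", "2022-05-23", "2022-05-24"],
        "Adj Close": [30.0, 31.0, 32.0],
    }
)

# Select a column by its name instead of its position.
adj_close = df["Adj Close"]

# Summary statistics are available as methods on the column.
average_price = adj_close.mean()
highest_price = adj_close.max()

print(average_price)
print(highest_price)
```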


Read From a CSV File Using Pandas

Source: *Kaefer, F., & Kaefer, P. (2020). Introduction to Python
Programming for Business and Social Science Applications. SAGE Publications, Inc. (US).*

In [None]:
!curl "https://query1.finance.yahoo.com/v7/finance/download/BHP.AX?period1=1629553321&period2=1661089321&interval=1d&events=history&includeAdjustedClose=true" > BHPData.csv



In [None]:
#import pandas module
import pandas as pd
from matplotlib import pyplot as plt

# Load the data
df = pd.read_csv('BHPData.csv')
print(df.head())

# create a new data frame that contains only the Date and Adj Close columns
df_data = df[['Date', 'Adj Close']]
print(df_data.head())

plt.plot(df_data['Date'], df_data['Adj Close'])


### Operations on data - E.g.: Calculate daily return

```
Daily return = (Today's price / Yesterday's price) - 1
```

For example, if we have an initial value of 100 and a final value of 110, the daily return would be (110/100) - 1 = 0.10 or 10%. Subtracting 1 from the ratio gives the proportional change as a decimal value, which can then be multiplied by 100 to get the percentage change.

In finance, daily returns are usually expressed as a percentage change in the value of an asset, which is why the formula ends with this subtraction.
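
The ratio form of the formula can be sketched as a small function. The name `daily_return` and its parameter names are chosen here for illustration only.

```python
def daily_return(today_price, yesterday_price):
    """Return the daily return as a decimal, e.g. 0.10 for a 10% gain."""
    return today_price / yesterday_price - 1


# The worked example from the text: (110/100) - 1.
result = daily_return(110, 100)

print(round(result, 4))        # decimal form
print(round(result * 100, 2))  # percentage form
```

The values are rounded before printing because floating-point division rarely lands exactly on a short decimal.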

## Daily Return of Stock Data

### Daily Return formula

Visit a financial website that provides stock price information and type in a company’s name or its stock’s ticker symbol. In the historical prices section, find the stock’s closing price for any two consecutive days. For example, assume a stock’s closing price was \$36.75 yesterday and \$35.50 the previous day. Subtract the previous day’s closing price from the most recent day’s closing price. In this example, subtract \$35.50 from \$36.75 to get \$1.25.

Now divide the result by the previous day's closing price to calculate the daily return, then multiply by 100 to convert it to a percentage. So \$1.25 divided by \$35.50 equals approximately 0.035, and 0.035 multiplied by 100 gives 3.5 percent.

    Daily return = (Today's price - Yesterday's price) / Yesterday's price
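
One possible way to apply this formula to a whole column of prices is with pandas. The sketch below uses only the two closing prices from the worked example; it writes the formula out with `shift()` and compares it with the `pct_change()` method, which is one option among several rather than the required approach.

```python
import pandas as pd

# Closing prices from the worked example: previous day first, then the most recent day.
close = pd.Series([35.50, 36.75])

# (today - yesterday) / yesterday, written out with shift()
manual = (close - close.shift(1)) / close.shift(1)

# pct_change() computes the same ratio; the first entry has no previous day, so it is NaN.
built_in = close.pct_change()

latest_return = manual.iloc[-1]
print(round(latest_return * 100, 1))  # percentage form
```

Both versions leave the first row empty (NaN), because the first day has no earlier price to compare against.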



To find the URL, navigate to Yahoo Finance, type in NAB.AX, and click the 'Historical' tab. Towards the right is a download option. Right-click the download link and select 'Copy Link Address' from the popup menu, then paste the link into the URL assignment statement.
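
The copied link can also be assembled in code. In the download links used in this notebook, `period1` and `period2` appear to be Unix timestamps (seconds since 1970-01-01 UTC) marking the start and end of the date range. The sketch below rebuilds the NAB.AX link under that assumption; the function name `build_download_url` is hypothetical, and the code only constructs the string without checking whether the server still answers requests at that address.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Base address taken from the download links used in this notebook.
BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_download_url(ticker, start, end):
    """Assemble a download link shaped like the ones used in this notebook."""
    params = {
        "period1": int(start.timestamp()),  # start of range (assumed Unix seconds)
        "period2": int(end.timestamp()),    # end of range (assumed Unix seconds)
        "interval": "1d",
        "events": "history",
        "includeAdjustedClose": "true",
    }
    return BASE_URL + ticker + "?" + urlencode(params)


url = build_download_url(
    "NAB.AX",
    datetime(2022, 5, 20, tzinfo=timezone.utc),
    datetime(2022, 8, 21, tzinfo=timezone.utc),
)
print(url)
```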

Sources:

*Kaefer, F., & Kaefer, P. (2020). Introduction to Python
Programming for Business and Social Science Applications. SAGE Publications, Inc. (US).*

*Tony Gaddis, Starting out with Python, 5th Edition*

*Deitel & Deitel, Intro to Python for Computer Science and Data Science, Global Edition*
