<a href="https://colab.research.google.com/github/hyunah3105/ISYS5002/blob/main/05_Working_with_Files.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **File Input & Output**
For program to retain data between the times it is run, you must save the data
- Data is saved to a file, typically on computer disk
- Saved data can be retrieved and used at a later time

“***Writing data to***”: saving data on a file
<br>***Output file***: a file that data is written to
<br>*“**Reading data from**”*: process of retrieving data from a file
<br>***Input file***: a file from which data is read

Three steps when a program uses a file
- Open the file
- Process the file
- Close the file

**Types of files**
<br>
- A **text file** is a sequence of characters
- A **binary file** (for images, videos and more) is a sequence of bytes
- First character in a text file or byte in a binary file is located at position 0
  - In a file of n characters or bytes, the highest position number is n – 1 (**end-of-file marker**)
- For each file you **open**, Python creates a **file object** that you’ll use to interact with the file


**Comma-separated values (CSV) file**
- CSV files are the most common format used for importing and exporting data from spreadsheets and databases.
- CSV files are text files that have delimiters. A delimiter is a character that separates data values.



# Text File

## File `open` Function

`file_variable = open(filename, mode)`

**Mode**: string specifying how the file will be opened

Example: reading only ('r'), writing ('w'), and appending ('a')


https://www.w3schools.com/python/python_file_handling.asp

In [1]:
#open a file named customers.txt
file = open("customers.txt","w")

### Writing Text File

In [17]:
### Create a text file named accounts.txt for writing ###

#open file in write mode
file = open("accounts.txt","w")

#write three customers names into the file
file.write("David Jones\n")
file.write("Hanna Noh\n")
file.write("Taylor Swift")
#writes the items of a list to the file
file.writelines("Katy Perry\nJeniffer")



#close the file
file.close()



### Reading Text File

In [14]:
### Reading data from a text file - customers.txt ###

#open file in read mode
file = open("accounts.txt",'r')

#print the records read from the file
print(file.readline())
print(file.readline())
#print each line iteratively
for row in file:
  print(row)

#close file
file.close()


David Jones

Hanna Noh

Taylor Swift


### Writing & Reading with the `with` statement

- Acquires a resource and assigns its corresponding object to a variable
- Allows the application to use the resource via that variable
- Calls the resource object’s close method to release the resource

Advantage of using a `with` statement with a file open is that when the `with` statement code block ends, the file closes.

At the end of the with statement’s suite, the `with` statement *implicitly* calls the file object’s `close` method to close the file

**Records**
* 100 Jones 24.98 
* 200 Doe 345.67 
* 300 Williams 0.00 
* 400 Stone -42.16 
* 500 Rich 224.62

In [19]:
### Write and reading from file using 'with' statement ###

# Open file for writing and write records
with open("accounts.txt",'w') as account:
  account.write("100 Jones 24.98\n")
  account.write("200 Doe 345.67\n")
  account.write("300 Williams 0.00\n")
  account.write("400 Stone -42.16\n")
  account.write("500 Rich 224.62")
    

In [27]:
### Reading from file ###
with open("accounts.txt",'r') as accountred:
  print(f'{"AccountID":10} {"Name":10} {"Balance":10}')
  
  for record in accountred:
    #print(record)

    accountID,name,balance = record.split()
    print(f'{accountID:10} {name:10} {balance:10}')

AccountID  Name       Balance   
100        Jones      24.98     
200        Doe        345.67    
300        Williams   0.00      
400        Stone      -42.16    
500        Rich       224.62    


# Module CSV

## Reading and Writing CSV file

Text files work fine when we are referencing small amounts of information, but when we use larger amounts of data, adding structure helps in organizing and retrieving values. 

One common format found in business and social sciences alike (as well as any field concerned with data science) is the comma-separated values (CSV) format. 

**CSV files** are the most common format used for importing and exporting data from spreadsheets and databases. 

CSV files are text files that have delimiters.  A **delimiter** is a character that separates data values. 

You can explore CSV files in spreadsheet software (such as Microsoft Excel), which will remove delimiters (usually commas) and store data values in separate cells.


One of the benefits of importing data files such as CSV files is the ability to read in a lot of data at once, parsing the data so your code can access individual values within the data. **By default, CSV files use commas (“,”) to separate data values**.



Python **CSV module** provides functions for working with CSV files

In [43]:
### Writing to a CSV file ###

#import csv module
import csv

#open file and write records
with open("accounts.csv",'w') as acc:
  writer = csv.writer(acc)
  writer.writerow([100, "Jones" ,24.98])
  writer.writerow([200, "Doe" ,345.67])
  writer.writerow([300, "Williams" ,0.00])
  writer.writerow([400, "Stone" ,-42.16])
  writer.writerow([500, "Rich" ,224.62])


In [44]:
### Reading from CSV file ###

#The csv module’s reader function returns an object that reads CSV-format data from the specified file object
import csv
#open file and read records

with open("accounts.csv",'r') as accRead:
  reader = csv.reader(accRead)

  for record in reader:
    #print(record)
    id, name, bal = record
    print(f'{id:10}, {name:10}, {bal:10}')

#open file and write records



100       , Jones     , 24.98     
200       , Doe       , 345.67    
300       , Williams  , 0.0       
400       , Stone     , -42.16    
500       , Rich      , 224.62    


## Getting data from financial website

In [None]:
!curl "https://query1.finance.yahoo.com/v7/finance/download/NAB.AX?period1=1653004800&period2=1661040000&interval=1d&events=history&includeAdjustedClose=true" > NABData.csv

In [None]:
import csv

# open file and write records



In [None]:
import csv


# open file and write records


# plot the results



# Module Pandas

## Using Pandas Module to Read in CSV Files (the “Easy Way”)

An easier method of importing files into Python is using the **Pandas** module. Pandas (name derived from “panel data”) is a data analysis library that, among other things, makes reading in CSV files and accessing the contained data much easier 

The primary data structure used in Pandas are a ***DataFrame***. A DataFrame has a *two-dimensional tabular format using rows and columns*. Using a DataFrame, we can reference columns by name, rather than having to count to figure out which column number we want. Pandas provides an assortment of methods like .mean() that will do summary statistics on our data

 
Read From a CSV File Using Pandas

Source: *Kaefer, F., & Kaefer, P. (2020). Introduction to Python 
Programming for Business and Social Science Applications. SAGE Publications, Inc. (US).*

In [None]:
!curl "https://query1.finance.yahoo.com/v7/finance/download/BHP.AX?period1=1629553321&period2=1661089321&interval=1d&events=history&includeAdjustedClose=true" > BHPData.csv

In [None]:
#import pandas module
import pandas as pd
from matplotlib import pyplot as plt

# Load the data



## Daily Return of Stock Data

### Daily Return formula

Visiting a financial website that provides stock price information. Type a company’s name or its stock’s ticker symbol. Find in the historical prices section the stock’s closing price for any two consecutive days. For example, assume a stock’s closing price was \$36.75 yesterday and that its closing price was \$35.50 the previous day. Subtract the previous day’s closing price from the most recent day’s closing price. In this example, subtract \$35.50 from \$36.75 to get \$1.25.

Now divide the results by the previous day's closing prices to calculate the daily return.  Multiply this by 100 to convert to a percentage.  So $1.25 divided by #35.50 equals 0.035.  Multiply 0.035 by 100 to 3.5 percent.

    Daily return = (Today prices - Yesterday price) / Yesterday price



To find the URL, navigate to Yahoo finance, type in NAB.AX, click the 'Historical' tab.  towards the right is a download option.  right click on the download link and select 'Copy Link Address' form the popup menu.  Now paste to the URL assignment statement.

In [None]:
#import pandas module
import pandas as pd
from matplotlib import pyplot as plt

# Load the data
data = pd.read_csv('BHPData.csv')


# Calculate the return
data['Daily_return'] = data['Adj Close'] / data['Adj Close'].shift(1)-1

# Plot the results
plt.plot(data['Date'],data['Daily_return'])

Sources:

* Kaefer, F., & Kaefer, P. (2020). Introduction to Python 
Programming for Business and Social Science Applications. SAGE Publications

* Tony Gaddis, Starting out with Python, 5th Edition

* Deitel & Deitel, Intro to Python for Computer Science and Data Science, Global Edition
