## Download Data

In [1]:
import os
dir_name = os.path.join(".", "data")
os.makedirs(dir_name, exist_ok=True)

In [2]:
import requests

url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt"

response = requests.get(url, stream=True)
filename = os.path.join(dir_name, "example1.txt")

if response.status_code == 200:
    total_size = int(response.headers.get("Content-Length", 0))
    downloaded_size = 0
    with open(filename, "wb") as file:
        for chunk in response.iter_content(chunk_size=1024 ** 2):
            if chunk:
                file.write(chunk)
                downloaded_size += len(chunk)
                if total_size:
                    # Content-Length may be missing, so guard the division.
                    percentage = downloaded_size / total_size * 100
                    print(f"Downloading: {percentage:.1f}%")
                else:
                    print(f"Downloading: {downloaded_size} bytes")
    print("Done.")
else:
    print(f"Error. HTTP code:{response.status_code}.")

Downloading: 100.0%
Done.


### Reading text files

<p>One way to read or write a file in Python is to use the built-in <code>open</code> function. The <code>open</code> function returns a <b>file object</b> that provides the methods and attributes you need to read, save, and manipulate the file. In this notebook, we only cover <b>.txt</b> files. The first argument is the path to the file, including the file name.</p>

<p>The mode argument is optional and the default value is <code>r</code>. In this notebook we only cover two modes:</p>

<ul>
    <li><b>r</b>: Read mode, for reading files</li>
    <li><b>w</b>: Write mode, for writing files (an existing file is overwritten)</li>
</ul>
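<p>As a minimal sketch of how the two modes differ, the cell below writes to a scratch file twice and then reads it back. The path <code>demo_path</code> and the strings written are hypothetical, chosen only for illustration; they are not part of the course data.</p>

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file, or empties it if it already exists.
with open(demo_path, "w") as f:
    f.write("first version\n")

# Opening with "w" a second time discards the previous contents.
with open(demo_path, "w") as f:
    f.write("second version\n")

# "r" is the default mode, so open(demo_path) would behave the same way.
with open(demo_path, "r") as f:
    demo_content = f.read()

print(demo_content)
```

<p>Only the second string survives, because each <code>w</code> open starts from an empty file.</p>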

<p>For the next example, we will use the text file <b>example1.txt</b> downloaded above. We open the file in read mode:</p>

In [3]:
file1 = open(filename, "r")

<p>The name of the file:</p>

In [4]:
file1.name

'.\\data\\example1.txt'

<p>The mode the file object is in:</p>

In [5]:
file1.mode

'r'

<p>We can read the file and assign it to a variable:</p>

In [6]:
file_content = file1.read()
file_content

'This is line 1 \nThis is line 2\nThis is line 3'

<p>The <b>\n</b> is the newline character; it marks the start of a new line.</p>

<p>We can print the file:</p>

In [7]:
print(file_content)

This is line 1 
This is line 2
This is line 3


<p>The file content is of type string (<code>str</code>):</p>

In [8]:
type(file_content)

str

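<p>It is important to close the file once you are done with it. Closing releases the resources held by the file object, and doing it explicitly gives consistent behavior across Python versions and implementations instead of relying on the interpreter to clean up.</p>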

In [9]:
file1.close()
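<p>If an exception is raised between <code>open</code> and <code>close()</code>, the call to <code>close()</code> is skipped. One way to guard against that is a <code>try</code>/<code>finally</code> block, sketched below. The scratch file <code>demo_path</code> is a hypothetical stand-in whose content mirrors <b>example1.txt</b>, so the cell runs on its own.</p>

```python
import os
import tempfile

# Hypothetical scratch file with the same three lines as example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "close_demo.txt")
setup = open(demo_path, "w")
setup.write("This is line 1 \nThis is line 2\nThis is line 3")
setup.close()

demo_file = open(demo_path, "r")
try:
    demo_content = demo_file.read()
finally:
    # Runs whether or not read() raised an exception.
    demo_file.close()

print(demo_file.closed)
```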

### A better way to open a file

<p>Using the <code>with</code> statement is better practice because it automatically closes the file, even if the code raises an exception. Python runs everything in the indented block and then closes the file object.</p>

In [10]:
with open(filename, "r") as file:
    file_content = file.read()
print(file_content)

This is line 1 
This is line 2
This is line 3


<p>The file object is now closed. You can verify this by running the following cell:</p>

In [11]:
file.closed

True
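<p>The claim that the file is closed even when an exception occurs can be checked directly. The sketch below raises a simulated error inside the <code>with</code> block; the scratch file <code>demo_path</code> and the <code>ValueError</code> are assumptions made for illustration only.</p>

```python
import os
import tempfile

# Hypothetical scratch file so the cell is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(demo_path, "w") as f:
    f.write("This is line 1 \n")

try:
    with open(demo_path, "r") as demo_file:
        demo_file.read()
        # Simulated failure while the file is still open.
        raise ValueError("simulated failure inside the block")
except ValueError as err:
    print(f"Caught: {err}")

# The name demo_file is still bound here, and the file has been closed.
print(demo_file.closed)
```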

<p>The syntax can look a little confusing at first, because the name of the file object comes after the <code>as</code> keyword.</p>

<p>We don't have to read the entire file. For example, we can read the first 4 characters by passing 4 as an argument to the method <code>.read()</code>:</p>

In [12]:
with open(filename, "r") as file:
    print(file.read(4))

This


<p>Once the method <code>.read(4)</code> is called, the first 4 characters are returned. If we call the method again, reading resumes where the previous call stopped, so the next 4 characters are returned. The output of the following cell demonstrates this for different arguments to the method <code>read()</code>:</p>

In [13]:
with open(filename, "r") as file:
    print(file.read(4))
    print(file.read(5))
    print(file.read(10))
    print(file.read(15))

This
 is l
ine 1 
Thi
s is line 2
Thi


<p>Here is an example using the same file, but instead we read 16, 5, and then 9 characters at a time:</p>

In [14]:
with open(filename, "r") as file:
    print(file.read(16))
    print(file.read(5))
    print(file.read(9))

This is line 1 

This 
is line 2
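<p>Because each call to <code>read(n)</code> continues from where the previous one stopped, and returns an empty string once the end of the file is reached, a file can be consumed in fixed-size pieces with a loop. The following is a minimal sketch; <code>demo_path</code> is a hypothetical scratch file with the same content as <b>example1.txt</b>, and the chunk size of 16 is arbitrary.</p>

```python
import os
import tempfile

# Hypothetical scratch file with the same three lines as example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "chunks_demo.txt")
with open(demo_path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

chunks = []
with open(demo_path, "r") as f:
    while True:
        chunk = f.read(16)  # at most 16 characters per call
        if chunk == "":     # an empty string signals the end of the file
            break
        chunks.append(chunk)

print(chunks)
```

<p>Joining the chunks back together with <code>"".join(chunks)</code> gives the original text.</p>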


<p>We can also read one line of the file at a time using the method <code>readline()</code>:</p>

In [15]:
with open(filename, "r") as file:
    print(f"first line: {file.readline()}")

first line: This is line 1 



<p>We can also pass an argument to <code>readline()</code> to specify the number of characters we want to read. However, unlike <code>read()</code>, <code>readline()</code> can only read one line at most.</p>

In [16]:
with open(filename, "r") as file:
    print(file.readline(20))    # does not read past the end of line
    print(file.read(20))   # Returns the next 20 chars

This is line 1 

This is line 2
This 
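<p>A size argument smaller than the line length makes <code>readline()</code> return part of the line, and the next call continues on the same line. The sketch below illustrates this on a hypothetical scratch file, <code>demo_path</code>, whose content mirrors <b>example1.txt</b>; <code>repr()</code> is used so the newline characters are visible.</p>

```python
import os
import tempfile

# Hypothetical scratch file with the same three lines as example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "readline_demo.txt")
with open(demo_path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

with open(demo_path, "r") as f:
    part1 = f.readline(5)   # first 5 characters of line 1
    part2 = f.readline(5)   # next 5 characters, still on line 1
    rest = f.readline()     # remainder of line 1, including the newline
    second = f.readline()   # the whole of line 2

for piece in (part1, part2, rest, second):
    print(repr(piece))
```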


<p>We can use a loop to iterate through each line:</p>

In [17]:
with open(filename, "r") as file:
    # enumerate() supplies the counter, so no manual increment is needed
    for i, line in enumerate(file):
        print(f"Iteration {i}: {line}")

Iteration 0: This is line 1 

Iteration 1: This is line 2

Iteration 2: This is line 3
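<p>The blank lines in the output above appear because each line read from the file still ends with <code>\n</code>, and <code>print</code> adds a newline of its own. One way to avoid this, sketched below, is to strip the trailing newline before printing. The scratch file <code>demo_path</code> is a hypothetical stand-in for <b>example1.txt</b>.</p>

```python
import os
import tempfile

# Hypothetical scratch file with the same three lines as example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "loop_demo.txt")
with open(demo_path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

lines_seen = []
with open(demo_path, "r") as f:
    for i, line in enumerate(f):
        clean = line.rstrip("\n")  # drop only the trailing newline
        lines_seen.append(clean)
        print(f"Iteration {i}: {clean}")
```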


<p>We can use the method <code>readlines()</code> to read the whole text file into a list of lines:</p>

In [18]:
with open(filename, "r") as file:
    file_as_a_list = file.readlines()

<p>Each element of the list corresponds to a line of text:</p>

In [19]:
file_as_a_list[0]

'This is line 1 \n'

In [20]:
file_as_a_list[1]

'This is line 2\n'

In [21]:
file_as_a_list[2]

'This is line 3'
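<p>As the outputs show, <code>readlines()</code> keeps the trailing <code>\n</code> on every line except the last, which has none in this file. If the newlines are not wanted, one alternative is to read the whole file and call the string method <code>splitlines()</code>. The sketch below compares the two on a hypothetical scratch file, <code>demo_path</code>, with the same content as <b>example1.txt</b>.</p>

```python
import os
import tempfile

# Hypothetical scratch file with the same three lines as example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
with open(demo_path, "w") as f:
    f.write("This is line 1 \nThis is line 2\nThis is line 3")

with open(demo_path, "r") as f:
    lines_with_newlines = f.readlines()

with open(demo_path, "r") as f:
    lines_without_newlines = f.read().splitlines()

print(lines_with_newlines)
print(lines_without_newlines)
```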
