# Reading and writing files

## Reading files

In [1]:
file = open("spider.txt")
print(file.readline())

The itsy bitsy spider climbed up the waterspout.



In [2]:
print(file.readline())

Down came the rain



In [3]:
print(file.read())

and washed the spider out.
Out came the sun
and dried up all the rain
and the itsy bitsy spider climbed up the spout again.


In [4]:
file.close()

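Together, these cells print the whole file. Each call to `readline()` reads one line from the current position and returns it as a string, so the first two cells return the first two lines. The `read()` method then reads from the current position to the end of the file, which is why the third cell prints only the remaining lines. On a freshly opened file, `read()` returns the entire contents. The `close()` method closes the file.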
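The way `readline()` and `read()` share one position in the file can be checked with a small self-contained sketch. It does not use `spider.txt`: the file name `rhyme.txt` and its contents are made up for illustration, and the file lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rhyme.txt")  # hypothetical file name
    with open(path, "w") as file:
        file.write("one\ntwo\nthree\nfour\n")

    file = open(path)
    first = file.readline()   # reads the first line
    second = file.readline()  # continues with the second line
    rest = file.read()        # everything after the second line
    at_end = file.read()      # empty string: the position is already at the end
    file.close()

print(repr(first), repr(second), repr(rest), repr(at_end))
```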

In [6]:
with open("spider.txt") as file:
    print(file.readline())

The itsy bitsy spider climbed up the waterspout.



This last cell opens `spider.txt` with a `with` statement. Because no mode is given, the file is opened in read mode. The `as` keyword assigns the file object to the variable `file`. The code block inside the `with` statement is executed, and then the file is **closed automatically**.
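One way to confirm the automatic close is to inspect the file object's `closed` attribute inside and outside the block. This is only a sketch: the file name `demo.txt` is an assumption, and the file is created in a temporary directory so nothing existing is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    with open(path, "w") as file:
        file.write("first line\nsecond line\n")

    with open(path) as file:
        first_line = file.readline()
        closed_inside = file.closed   # still open inside the block

    closed_outside = file.closed      # closed once the block has ended

print(repr(first_line), closed_inside, closed_outside)
```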

## Iterating through files

In [7]:
with open("spider.txt") as file:
    for line in file:
        print(line.upper())

THE ITSY BITSY SPIDER CLIMBED UP THE WATERSPOUT.

DOWN CAME THE RAIN

AND WASHED THE SPIDER OUT.

OUT CAME THE SUN

AND DRIED UP ALL THE RAIN

AND THE ITSY BITSY SPIDER CLIMBED UP THE SPOUT AGAIN.


The output has a blank line after each line of text. Each line read from the file ends with a newline character, and `print()` adds another one.

In [8]:
with open("spider.txt") as file:
    for line in file:
        print(line.strip().upper())

THE ITSY BITSY SPIDER CLIMBED UP THE WATERSPOUT.
DOWN CAME THE RAIN
AND WASHED THE SPIDER OUT.
OUT CAME THE SUN
AND DRIED UP ALL THE RAIN
AND THE ITSY BITSY SPIDER CLIMBED UP THE SPOUT AGAIN.


Here `strip()` removes the newline character (along with any other leading or trailing whitespace), so the output has no empty lines.
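When leading or trailing spaces matter, `rstrip("\n")` is a narrower alternative that removes only the trailing newline. A minimal sketch with a made-up string:

```python
line = "  Down came the rain \n"  # made-up sample with extra spaces

stripped = line.strip()            # removes whitespace on both ends, newline included
newline_only = line.rstrip("\n")   # removes only the trailing newline

print(repr(stripped))
print(repr(newline_only))
```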

In [9]:
file = open("spider.txt")
lines = file.readlines()
file.close()
lines.sort()
print(lines)

['Down came the rain\n', 'Out came the sun\n', 'The itsy bitsy spider climbed up the waterspout.\n', 'and dried up all the rain\n', 'and the itsy bitsy spider climbed up the spout again.', 'and washed the spider out.\n']


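The lines have been sorted, so they are no longer in the order they had in the file. The default sort compares characters by code point, so lines starting with an uppercase letter come before lines starting with a lowercase letter. We can also see that Python displays the newline character as `\n` when printing a list of strings.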
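If a case-insensitive order is wanted instead, one option is to pass `key=str.lower` to the sort. The sketch below uses three of the lines as a made-up sample list, not the file itself.

```python
lines = [
    "The itsy bitsy spider\n",
    "Down came the rain\n",
    "and washed the spider out.\n",
]

default_order = sorted(lines)                  # uppercase letters sort first
ignoring_case = sorted(lines, key=str.lower)   # compare lowercased copies

print(default_order)
print(ignoring_case)
```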

## Writing files

In [1]:
with open("novel.txt", "w") as file:
    file.write("It was a dark and stormy night")

The `with open()` statement creates a file object and assigns it to the variable `file`. Here the `open()` function takes two arguments: the name of the file and the mode. In this case, the mode is `w`, which means "write". This tells `open()` to create a new file if it doesn't exist, or to overwrite the existing file if it does.

| Character | Meaning                                                         |
|-----------|-----------------------------------------------------------------|
| 'r'       | open for reading (default)                                      |
| 'w'       | open for writing, truncating the file first                     |
| 'x'       | open for exclusive creation, failing if the file already exists |
| 'a'       | open for writing, appending to the end of file if it exists     |
| 'b'       | binary mode                                                     |
| 't'       | text mode (default)                                             |
| '+'       | open for updating (reading and writing)                         |

Six commonly used combinations of these characters are:

- Read Only (`r`)
- Read and Write (`r+`)
- Write Only (`w`)
- Write and Read (`w+`)
- Append Only (`a`)
- Append and Read (`a+`)
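The difference between the `w`, `a`, and `x` modes can be seen in a short sketch. The file name `modes.txt` is made up, and the file is created in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes.txt")  # hypothetical file name

    with open(path, "w") as file:      # creates the file
        file.write("first\n")
    with open(path, "a") as file:      # appends to the end
        file.write("second\n")
    with open(path) as file:
        after_append = file.read()

    with open(path, "w") as file:      # truncates the existing content
        file.write("replaced\n")
    with open(path) as file:
        after_write = file.read()

    try:
        open(path, "x")                # exclusive creation: the file already exists
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_append), repr(after_write), exclusive_failed)
```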

```python
with open("sample_data/declaration.txt", "rt") as textfile:
    for line in textfile:
        print(line)
```
In this example, the first argument is a string containing the file name (`sample_data/declaration.txt`). The second argument identifies the mode, that is, the way in which the file will be used (`rt`, reading in text mode).

```python
f = open("sample_data/declaration.txt", "w")
```
In this example, the code tells Python to open this file for writing (`w` mode). Note that this truncates the file if it already exists.

**Encoding**

Python distinguishes between binary mode (`b`) and text mode (`t`). By default, files are opened in text mode, which means you read and write strings, which are encoded and decoded using a specific encoding. If no encoding is specified, the default is platform-dependent: `locale.getencoding()` is called to get the current locale encoding. If you need to open a text file in a specific encoding, you must specify it.

```python
f = open('workfile', 'w', encoding="utf-8")
```
In this example, `encoding="utf-8"` specifies that the file should be opened with UTF-8, the modern de facto standard. In binary mode, data is read and written as `bytes` objects. You cannot specify an encoding when opening a file in binary mode.
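To make the text/binary distinction concrete, the following sketch writes a short string as UTF-8 and reads it back both ways. The sample text and the file name `encoded.txt` are made up for illustration.

```python
import os
import tempfile

text = "caf\u00e9"  # made-up sample: "cafe" with an accented final letter

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "encoded.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as file:
        file.write(text)

    with open(path, "rb") as file:                 # binary mode: raw bytes, no decoding
        raw = file.read()

    with open(path, encoding="utf-8") as file:     # text mode: bytes decoded to str
        decoded = file.read()

# The accented letter takes two bytes in UTF-8, so the lengths differ.
print(len(decoded), len(raw))
```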

## How to write file paths in code

A Windows file path:

```
C:\my-directory\target-file.txt
```

The same path written in Python with forward slashes:

```
C:/my-directory/target-file.txt
```

The same path written in Python with escaped backslashes:

```
C:\\my-directory\\target-file.txt
```
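Backslashes need care because, in an ordinary string literal, a sequence such as `\t` is read as a tab character. As a hedged illustration, the sketch below uses `pathlib.PureWindowsPath` (which works on any operating system because it never touches the disk) to show that forward slashes, escaped backslashes, and a raw string all describe the same path.

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("C:/my-directory/target-file.txt")
escaped = PureWindowsPath("C:\\my-directory\\target-file.txt")
raw = PureWindowsPath(r"C:\my-directory\target-file.txt")  # raw string: no escaping

all_equal = forward == escaped == raw
print(all_equal)
print(raw.name)
```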

Getting the current working directory (CWD):
```
os.getcwd()
```

Storing the current working directory in a dictionary for later use:

```
outputs['current_directory_before'] = os.getcwd()
```

In [2]:
import os
os.getcwd()

'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python'

In [11]:
outputs = {}

In [12]:
outputs['current_directory_before'] = os.getcwd()

In [13]:
outputs['files_and_directories'] = os.listdir()

In [16]:
outputs['path_value'] = os.environ.get('PATH')

In [17]:
outputs

{'current_directory_before': 'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python',
 'files_and_directories': ['.git',
  '.ipynb_checkpoints',
  'novel.txt',
  'reading-writing-file.ipynb',
  'README.md',
  'spider.txt'],
 'path_value': 'C:\\SIMULIA\\Commands;C:\\Program Files\\Microsoft MPI\\Bin\\;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\System32\\Wbem;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Windows\\System32\\OpenSSH\\;C:\\Program Files\\Polyspace\\R2021a\\runtime\\win64;C:\\Program Files\\Polyspace\\R2021a\\bin;C:\\Program Files\\Microsoft SQL Server\\150\\Tools\\Binn\\;C:\\Program Files\\Git\\cmd;C:\\Program Files\\dotnet\\;C:\\Users\\Asus\\AppData\\Local\\Programs\\Python\\Python311\\Scripts\\;C:\\Users\\Asus\\AppData\\Local\\Programs\\Python\\Python311\\;C:\\Users\\Asus\\AppData\\Local\\Microsoft\\WindowsApps;;C:\\Users\\Asus\\AppData\\Local\\Programs\\Microsoft VS Code\\bin'}

## Working with files

In [19]:
import os
os.remove("novel.txt")

This code removes the file `novel.txt`.

In [20]:
import os
os.remove("novel.txt")

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'novel.txt'

Running the same code again raises a `FileNotFoundError`. You cannot remove a file that doesn't exist.
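One way to avoid the traceback is to catch the exception. The helper name `remove_if_present` below is hypothetical, and the sketch works on a temporary copy of `novel.txt` instead of the real file.

```python
import os
import tempfile

def remove_if_present(path):
    """Delete path and report whether a file was actually removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "novel.txt")
    with open(target, "w") as file:
        file.write("It was a dark and stormy night")

    first_attempt = remove_if_present(target)    # the file exists, so it is removed
    second_attempt = remove_if_present(target)   # the file is already gone

print(first_attempt, second_attempt)
```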

```python
os.rename("first_draft.txt", "finished_masterpiece.txt")
```
This code can be used to rename a file. 

In [28]:
os.path.exists("README.md")

True

In [29]:
os.path.exists("userlist.txt")

False

This code checks whether a path exists. `os.path.exists()` returns `True` if the file (or directory) exists and `False` if it does not.
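Because `os.path.exists()` is also true for directories, `os.path.isfile()` can be used when only regular files should count. A small sketch in a temporary directory (the file names are reused from the cells above purely for illustration):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "README.md")
    with open(file_path, "w") as file:
        file.write("# demo\n")

    checks = {
        "dir exists": os.path.exists(tmp_dir),
        "dir is file": os.path.isfile(tmp_dir),
        "file exists": os.path.exists(file_path),
        "file is file": os.path.isfile(file_path),
        "missing exists": os.path.exists(os.path.join(tmp_dir, "userlist.txt")),
    }

print(checks)
```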

## More file information

In [30]:
os.path.getsize("spider.txt")
#This code returns the file size in bytes

191

In [31]:
os.path.getmtime("spider.txt")
#This code returns the file's last modification time as a Unix timestamp

1721209292.3631647

In [58]:
import datetime
timestamp = os.path.getmtime("spider.txt")
dt_object = datetime.datetime.fromtimestamp(timestamp)
#This code will provide the date and time for the file in an 
#easy-to-understand format

In [60]:
print(dt_object.date())

2024-07-17
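Once a `datetime` object is available, it can be formatted in several ways. The sketch below uses a made-up fixed moment in place of a value returned by `fromtimestamp()`, so that the result does not depend on the local time zone.

```python
import datetime

# Made-up moment standing in for a value returned by fromtimestamp().
dt_object = datetime.datetime(2024, 7, 17, 9, 41, 32)

date_only = dt_object.date().isoformat()               # date part only
date_and_time = dt_object.strftime("%Y-%m-%d %H:%M")   # custom format

print(date_only)
print(date_and_time)
```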


In [33]:
os.path.abspath("spider.txt")
#This code takes the file name and turns it into an absolute path

'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python\\spider.txt'



## Directories

In [34]:
print(os.getcwd())
#This code snippet prints the current working directory.

C:\Users\Asus\Desktop\GitHUb\OS-Python


In [35]:
os.mkdir("new_dir")
#The os.mkdir("new_dir") function creates a new directory called new_dir

In [36]:
os.chdir("new_dir")
os.getcwd()
#This code snippet changes the current working directory to new_dir. 
#The second line returns the current working directory.

'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python\\new_dir'

In [42]:
os.chdir("C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python")
os.getcwd()

'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python'

In [38]:
os.mkdir("newer_dir")

In [39]:
os.rmdir("newer_dir")
#The previous cell creates a new directory called newer_dir.
#This line deletes the newer_dir directory.

In [44]:
os.listdir("C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python")
#This code snippet returns a list of all the files and
#sub-directories in the given directory.

['.git',
 '.ipynb_checkpoints',
 'new_dir',
 'reading-writing-file.ipynb',
 'README.md',
 'spider.txt']

In [43]:
dir = "C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python"
for name in os.listdir(dir):
    fullname = os.path.join(dir, name)
    if os.path.isdir(fullname):
        print("{} is a directory".format(fullname))
    else:
        print("{} is a file".format(fullname))

C:\Users\Asus\Desktop\GitHUb\OS-Python\.git is a directory
C:\Users\Asus\Desktop\GitHUb\OS-Python\.ipynb_checkpoints is a directory
C:\Users\Asus\Desktop\GitHUb\OS-Python\new_dir is a directory
C:\Users\Asus\Desktop\GitHUb\OS-Python\reading-writing-file.ipynb is a file
C:\Users\Asus\Desktop\GitHUb\OS-Python\README.md is a file
C:\Users\Asus\Desktop\GitHUb\OS-Python\spider.txt is a file


Here is the code all together. It defines a `dir` variable holding the name of the directory we want to check, which makes the code more readable and easier to reuse. (Because `dir` is also the name of a Python built-in function, a name such as `directory` is a safer choice in real code.) Then it iterates through the names returned by `os.listdir()`. We know from the previous call to this function that these are just file names without the directory. By using `os.path.join()`, we join the directory to each of those names and create a string with a valid full path. Finally, we pass that full path to `os.path.isdir()` to check whether it is a directory or a file.
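The same steps can be wrapped in a function so they are easy to reuse and check. This is a sketch under assumptions: the function name `classify_entries` is hypothetical, and it runs against a temporary directory with made-up contents.

```python
import os
import tempfile

def classify_entries(directory):
    """Return (name, kind) pairs for every entry in directory, sorted by name."""
    result = []
    for name in sorted(os.listdir(directory)):
        fullname = os.path.join(directory, name)
        kind = "directory" if os.path.isdir(fullname) else "file"
        result.append((name, kind))
    return result

with tempfile.TemporaryDirectory() as tmp_dir:
    os.mkdir(os.path.join(tmp_dir, "new_dir"))
    with open(os.path.join(tmp_dir, "spider.txt"), "w") as file:
        file.write("The itsy bitsy spider\n")

    entries = classify_entries(tmp_dir)

print(entries)
```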

## Files and directories

Let’s take a look at two examples. The first example uses the `os` module; the second uses `pathlib`. These two code examples do the same thing: they create a directory called `test1` and move a file named `README.md` from the `sample_data` folder into `test1`.

An example of using `os` functions to create a directory and move a file:

In [46]:
# Create a directory and move a file from one directory to another
# using low-level OS functions.

import os

# Check to see if a directory named "test1" exists under the current
# directory. If not, create it:
dest_dir = os.path.join(os.getcwd(), "test1")
if not os.path.exists(dest_dir):
    os.mkdir(dest_dir)


# Construct source and destination paths:
src_file = os.path.join(os.getcwd(), "sample_data", "README.md")
dest_file = os.path.join(os.getcwd(), "test1", "README.md")


# Move the file from its original location to the destination:
os.rename(src_file, dest_file)

In [49]:
# Create a directory and move a file from one directory to another
# using Pathlib.

from pathlib import Path

# Check to see if the "test1" subdirectory exists. If not, create it:
dest_dir = Path("./test1/")
if not dest_dir.exists():
    dest_dir.mkdir()

# Construct source and destination paths:
src_file = Path("./sample_data/README.md")
dest_file = dest_dir / "README.md"

# Move the file from its original location to the destination:
src_file.rename(dest_file)

WindowsPath('test1/README.md')
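For readers who want to try the `pathlib` version without a `sample_data` folder, here is a self-contained sketch that builds the folder in a temporary directory first. The use of `mkdir(exist_ok=True)` in place of the explicit existence check is a substitution made here, not part of the original example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    base = Path(tmp_dir)

    # Set up a stand-in for sample_data/README.md.
    (base / "sample_data").mkdir()
    src_file = base / "sample_data" / "README.md"
    src_file.write_text("demo\n")

    # Create the destination directory; no error if it already exists.
    dest_dir = base / "test1"
    dest_dir.mkdir(exist_ok=True)

    # Move the file into the destination directory.
    dest_file = dest_dir / "README.md"
    src_file.rename(dest_file)

    moved = dest_file.exists()
    source_gone = not src_file.exists()

print(moved, source_gone)
```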

**The OS module**

Python’s `os` module, which provides miscellaneous operating system interfaces, is very useful for working with files, directories, and permissions. Let’s take a look at each.

- **File operations**

A file name can be thought of as two names separated by a dot. For example, in `helloworld.txt`, `helloworld` is the file name and `.txt` is the extension, which defines the file type. `os` provides functions to create, read, update, and delete files. Some of the basic operations include:

     - Opening and closing files
     - Reading from and writing to files
     - Appending to files


- **Directories**
`os` also provides functions to create, read, update, and delete directories, as well as change directories and list files. Knowing how to use these functions is key to working with files. For example, `os.listdir(path)` returns a list of all files and subdirectories in a directory.

- **Permissions**
Being able to update file permissions is important when installing software from a terminal window. The `os.chmod()` function changes the permissions of a file for its owner, its group, and other users.

- **Things to keep in mind**
One thing to be aware of is that Python treats text and binary files differently. Because Python is cross-platform, it tries to automatically handle different ASCII line endings. If you’re processing a binary file, make sure to open it in binary mode so Python doesn’t try to “fix” newlines in a binary file.

A best practice is to always close() a file when you’re done reading or writing to it. Even though Python usually closes them for you, it’s a good signal to other people reading your code that you’re done with that file. Make sure to catch any potential errors from filesystem calls, such as permission denied, file not found, and so on. Generally, you wrap them in try/except to handle those errors.
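A minimal sketch of wrapping a filesystem call in `try`/`except` follows. The helper name `read_text_or_none` is hypothetical, and the files live in a temporary directory; real code would choose its own way of reporting errors.

```python
import os
import tempfile

def read_text_or_none(path):
    """Return the text in path, or None if the file cannot be opened."""
    try:
        with open(path) as file:
            return file.read()
    except FileNotFoundError:
        print("File not found: {}".format(path))
    except PermissionError:
        print("Permission denied: {}".format(path))
    return None

with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "spider.txt")
    with open(existing, "w") as file:
        file.write("The itsy bitsy spider\n")

    found = read_text_or_none(existing)
    missing = read_text_or_none(os.path.join(tmp_dir, "userlist.txt"))

print(repr(found), missing)
```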

## Exercise

Question 1
The `create_python_script` function creates a new Python script in the current working directory, adds the comment line stored in the `comments` variable, and returns the size of the new file. Fill in the gaps to create a script called "program.py".

In [63]:
import os
def create_python_script(filename):
  comments = "# Start of a new Python program"
  with open(filename, "w") as file:
    file.write(comments)

  filesize = os.path.getsize(filename)
  return filesize

print(create_python_script("program.py"))

31


In [64]:
os.getcwd()

'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python'

The parent_directory function returns the name of the directory that's located just above the current working directory. Remember that '..' is a relative path alias that means "go up to the parent directory". Fill in the gaps to complete this function.

In [65]:
import os

def parent_directory():
    # Create a relative path to the parent 
    # of the current working directory 
    relative_parent = os.path.join(os.getcwd(), '..')
    
    # Return the absolute path of the parent directory
    return os.path.abspath(relative_parent)

print(parent_directory())

C:\Users\Asus\Desktop\GitHUb


Question 2
The new_directory function creates a new directory inside the current working directory, then creates a new empty file inside the new directory, and returns the list of files in that directory. Fill in the gaps to create a file "script.py" in the directory "PythonPrograms".

In [67]:
import os

def new_directory(directory, filename):
  # Before creating a new directory, check to see if it already exists
  if not os.path.isdir(directory):
    os.mkdir(directory)

  # Create the new file inside of the new directory
  os.chdir(directory)
  with open(filename, "w") as file:
    pass

  # Return the list of files in the new directory
  return os.listdir()

print(new_directory("PythonPrograms", "script.py"))

['script.py']


The file_date function creates a new file in the current working directory, checks the date that the file was modified, and returns just the date portion of the timestamp in the format of yyyy-mm-dd. Fill in the gaps to create a file called "newfile.txt" and check the date that it was modified.



In [66]:
import os
import datetime

def file_date(filename):
  # Create the file in the current directory
  with open(filename, "w") as file:
    pass
  timestamp = os.path.getmtime(filename)
  # Convert the timestamp into a readable format, then into a string
  dt_ob = datetime.datetime.fromtimestamp(timestamp)
  # Return just the date portion 
  # Hint: how many characters are in “yyyy-mm-dd”? 
  return dt_ob.strftime('%Y-%m-%d')

print(file_date("newfile.txt")) 
# Should be today's date in the format of yyyy-mm-dd

2024-07-17


# Reading and writing CSV files

In [73]:
os.chdir("C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python")
os.getcwd()

'C:\\Users\\Asus\\Desktop\\GitHUb\\OS-Python'

In [85]:
import csv

f = open("csv_file.csv")
csv_f = csv.reader(f)
for row in csv_f:
    name, phone, role = row
    print("Name: {}, Phone: {}, Role: {}".format(name, phone, role))
f.close()

Name: Sabrina Green, Phone: 802-867-5309, Role: System Administrator
Name: Eli Jones, Phone: 684-3481127, Role: IT specialist
Name: Melody Daniels, Phone: 846-687-7436, Role: Programmer
Name: Charlie Rivera, Phone: 698-746-3357, Role: Web Developer


In [86]:
csv_f.line_num

4

## Generating CSV

In [80]:
import csv

hosts = [["workstation.local", "192.168.25.46"],["webserver.cloud", "10.2.5.6"]]
with open('hosts.csv', 'w', newline='') as hosts_csv:
    writer = csv.writer(hosts_csv)
    writer.writerows(hosts)
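The `csv` module documentation recommends opening CSV files with `newline=''` so that the module controls line endings itself; without it, extra blank rows can appear on Windows. The following round-trip sketch writes the same `hosts` list to a temporary file and reads it back.

```python
import csv
import os
import tempfile

hosts = [["workstation.local", "192.168.25.46"], ["webserver.cloud", "10.2.5.6"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "hosts.csv")

    # newline="" lets the csv module write its own line endings.
    with open(path, "w", newline="") as hosts_csv:
        csv.writer(hosts_csv).writerows(hosts)

    with open(path, newline="") as hosts_csv:
        rows_read = list(csv.reader(hosts_csv))

print(rows_read == hosts)
```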

## Reading and writing CSV Files with Dictionaries

In [81]:
with open('software.csv') as software:
    reader = csv.DictReader(software)
    for row in reader:
        print("{} has {} users".format(row["name"], row["users"]))

MailTree has 324 users
CalDoor has 22 users
Chatty Chicken has 4 users


Here the code opens the file and creates a `DictReader` to process the CSV data. It then goes through the rows and accesses the information in each row by key, just as we would with a dictionary.
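To experiment with `DictReader` without the `software.csv` file, the data can be supplied from a string. In this sketch the two-column layout is an assumption (the real file may have more columns); the values mirror the output above. Note that every value comes back as a string, so numbers must be converted explicitly.

```python
import csv
import io

# Assumed layout: only the two columns used above.
csv_text = "name,users\nMailTree,324\nCalDoor,22\nChatty Chicken,4\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

fieldnames = reader.fieldnames                         # taken from the header row
total_users = sum(int(row["users"]) for row in rows)   # values are strings

print(fieldnames, total_users)
```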

In [82]:
users = [
    {"name": "Sol Mansi", "username": "solm", "department": "IT infrastructure"},
    {"name": "Lio Nelson", "username": "lion", "department": "User Experience Research"},
    {"name": "Charlie Grey", "username": "greyc", "department": "Development"},
]

keys = ["name", "username", "department"]

with open('by_department.csv', 'w', newline='') as by_department:
    writer = csv.DictWriter(by_department, fieldnames=keys)
    writer.writeheader()
    writer.writerows(users)

## Exercise

Question 1
We're working with a list of flowers and some information about each one. The create_file function writes this information to a CSV file. The contents_of_file function reads this file into records and returns the information in a nicely formatted block. Fill in the gaps of the contents_of_file function to turn the data in the CSV file into a dictionary using DictReader.

In [88]:
import os
import csv

# Create a file with data in it
def create_file(filename):
  with open(filename, "w") as file:
    file.write("name,color,type\n")
    file.write("carnation,pink,annual\n")
    file.write("daffodil,yellow,perennial\n")
    file.write("iris,blue,perennial\n")
    file.write("poinsettia,red,perennial\n")
    file.write("sunflower,yellow,annual\n")


# Read the file contents and format the information about each row
def contents_of_file(filename):
  return_string = ""

  # Call the function to create the file 
  create_file(filename)

  # Open the file
  with open(filename) as file:
    # Read the rows of the file into a dictionary
    rows = csv.DictReader(file)
    # Process each item of the dictionary
    for row in rows:
      return_string += "a {} {} is {}\n".format(row["color"], row["name"], row["type"])
  return return_string


#Call the function
print(contents_of_file("flowers.csv"))

a pink carnation is annual
a yellow daffodil is perennial
a blue iris is perennial
a red poinsettia is perennial
a yellow sunflower is annual



Using the CSV file of flowers again, fill in the gaps of the contents_of_file function to process the data without turning it into a dictionary. How do you skip over the header record with the field names?

In [90]:
import os
import csv

# Create a file with data in it
def create_file(filename):
  with open(filename, "w") as file:
    file.write("name,color,type\n")
    file.write("carnation,pink,annual\n")
    file.write("daffodil,yellow,perennial\n")
    file.write("iris,blue,perennial\n")
    file.write("poinsettia,red,perennial\n")
    file.write("sunflower,yellow,annual\n")

# Read the file contents and format the information about each row
def contents_of_file(filename):
  return_string = ""

  # Call the function to create the file 
  create_file(filename)

  # Open the file
  with open(filename) as file:
    # Read the rows of the file
    rows = csv.reader(file)
    next(rows)
    # Process each row
    for row in rows:
      a,b,c = row
      # Format the return string for data rows only

      return_string += "a {} {} is {}\n".format(b,a,c)
  return return_string

#Call the function
print(contents_of_file("flowers.csv"))

a pink carnation is annual
a yellow daffodil is perennial
a blue iris is perennial
a red poinsettia is perennial
a yellow sunflower is annual

