# Output, read and write files
<hr>

## `print` function
<hr>

In Python, outputting content is very straightforward using the `print()` function. Inside the parentheses, you can pass various data types such as numbers, strings, lists, dictionaries, and more.

In [1]:
print([12, 45, 69])  # list

[12, 45, 69]


In [2]:
info = {"name": "chen", "mark": 85}  # avoid the name `dict`, which would shadow the built-in type
print(info)  # dict

{'name': 'chen', 'mark': 85}


You can also **output strings and variables together** inside the `print()` function.

In [3]:
a = [1, 2, 3]
b = 4
print("a =", a, ", b =", b)

a = [1, 2, 3] , b = 4


By default, the `print()` function adds a newline character at the end of the output, moving the cursor to the next line.

In [4]:
a = "zhang"
age = 25
print("my name is", a)  # after printing, the cursor moves to the next line
print("age is", age)

my name is zhang
age is 25


To change what `print()` writes at the end of its output, use the `end` argument, which defaults to the newline character `\n`.

In [5]:
a = "zhang"
age = 25
print("my name is %s" % a, end="-")
print("age is %d" % age)

my name is zhang-age is 25
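
As a small illustration of the `end` argument, the sketch below prints several values on one line from inside a loop. The function name `countdown` is made up for this example.

```python
def countdown(n):
    """Print n, n-1, ..., 1 on a single line, then finish the line with 'go!'."""
    for i in range(n, 0, -1):
        print(i, end=" ")  # end=" " replaces the default newline with a space
    print("go!")           # default end="\n" terminates the line


countdown(3)  # prints: 3 2 1 go!
```

Only the last `print()` call uses the default `end`, so the whole output forms one line.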


## Using `f-string`
<hr>

An f-string (a string literal prefixed with `f`) gives more flexible string output. Variables or expressions placed inside curly braces `{ }` are replaced by their values.

In [6]:
a = [1, 2, 3]
print(f"the array is {a}")  # output a list variable

the array is [1, 2, 3]


In [7]:
b = 3.2
print(f"the array is {a}, the number is {b}") # output several variables

the array is [1, 2, 3], the number is 3.2


In [8]:
import math

b = 4.56
print(f"the numbers are {math.pi} and {b}")

the numbers are 3.141592653589793 and 4.56


We can use a colon `:` inside the curly braces to specify the fill character, alignment, width, precision, and type of a variable in the f-string.

The format after the colon `:` is:

|: \<fill\>\<alignment\>\<width\><`.`>\<precision\>\<type\>|
|--|

The notations for alignment are:

|Code|Meaning|
|--|--|
|`>`|Align right (default for numbers)|
|`<`|Align left (default for strings)|
|`^`|Align center|
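
The alignment codes can be tried on a short string. This is a minimal sketch; the variable names and the width of 6 are arbitrary choices for illustration, and the square brackets only make the padding visible.

```python
word = "py"

left = f"[{word:<6}]"     # '[py    ]'  left-aligned in 6 characters
right = f"[{word:>6}]"    # '[    py]'  right-aligned
center = f"[{word:^6}]"   # '[  py  ]'  centered
filled = f"[{word:*^6}]"  # '[**py**]'  centered, padded with '*'

# With no alignment code, strings go left and numbers go right.
default_str = f"[{word:6}]"  # '[py    ]'
default_num = f"[{42:6}]"    # '[    42]'

for s in (left, right, center, filled, default_str, default_num):
    print(s)
```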

The notations for variable types are:

| Code | Meaning | Example | Output |
|------|---------|---------|--------|
| `s`  | String format (default for strings) | `f"{'text':s}"` | `text` |
| `d`  | Decimal integer | `f"{42:d}"` | `42` |
| `f`  | Fixed-point float | `f"{42.5:.2f}"` | `42.50` |
| `e`  | Scientific notation | `f"{42.5:.2e}"` | `4.25e+01` |
| `%`  | Percentage format | `f"{0.425:.1%}"` | `42.5%` |
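
The type codes in the table can be checked directly. The sketch below also shows one pitfall the table does not mention: the `d` code accepts only integers, so applying it to a float raises a `ValueError`. Variable names are illustrative.

```python
value = 42.5

fixed = f"{value:.2f}"       # '42.50'
scientific = f"{value:.2e}"  # '4.25e+01'
percent = f"{0.425:.1%}"     # '42.5%' (the value is multiplied by 100)
integer = f"{42:d}"          # '42'
text = f"{'text':s}"         # 'text'

# 'd' is only valid for integers; a float must be converted first.
try:
    f"{value:d}"
except ValueError as err:
    message = str(err)       # the message names the offending type, float

converted = f"{int(value):d}"  # '42' (int() truncates toward zero)
print(fixed, scientific, percent, integer, text, converted)
```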


In [9]:
a = 1
b = 4.56
# The first number is formatted as a float with 3 decimals
# The second number is right-aligned in a width of 20 and shown as a percentage with 2 decimals
# The `=` after b (Python 3.8+) also prints the expression text "b = " before the value
f"the numbers are {a:.3f} and {b = :>20.2%}"

'the numbers are 1.000 and b =              456.00%'

In [10]:
a = 5.6
print(f"{a = :+^10.2f}")  # fill with +, align center, width is 10, rounded to 2 decimals

a = +++5.60+++


## Using `format`
<hr>


We can also achieve different formats of output using `format`. The syntax is:

||string with curly braces `{ }`.format(value1, value2...)|
|:-:|:-:|


In the string, each `{ }` marks where a value is inserted, and the values themselves are passed inside the parentheses `( )` of `format`.

In [11]:
a = 3
b = 4.56
print("the numbers are {} and {:.3f}".format(a, b))

the numbers are 3 and 4.560


We can add the location index of the variable inside the `{ }`.

In [12]:
a = 3
b = 4.56
print("the numbers are {1} and {0}".format(a, b))  # specify the variable index inside {}
print("the numbers are {0} and {1}".format(a, b))
print("the numbers are {0} and {1} and {0}".format(a, b))

the numbers are 4.56 and 3
the numbers are 3 and 4.56
the numbers are 3 and 4.56 and 3


To output a curly brace, use double curly braces: `{{`, `}}`.

In [13]:
print("the numbers are {{ {0} }} and {{ {1} }}".format(a, b))

the numbers are { 3 } and { 4.56 }


With the `format()` method, you can also control formatting by adding a colon `:` followed by a format specification inside the curly braces `{}`, just as in f-strings.

In [14]:
a = 3
b = 4.56
print("the numbers are {0:10.2f} and {0:.3%}".format(a, b))  # both fields use index 0, so a is printed twice and b is unused

the numbers are       3.00 and 300.000%


In [15]:
a = 123456
print("the number is {:e}".format(a))

the number is 1.234560e+05


In [16]:
a = 5
print("a = {:0>5d}".format(a))  # align right, fill with 0, width 5, type integer

a = 00005


In [17]:
a = 5.6
print("a = {:+^10.2f}".format(a))  # align center, fill with +, width 10, round to 2 decimals

a = +++5.60+++


As with f-strings, the width and precision can be passed as variables by nesting curly braces `{}` inside the format specification and supplying the values as keyword arguments, for example:

In [18]:
a = 5.6
print("a = {:^{width}.{precision}f}".format(a, width=10, precision=3))  # width and precision as variables

a =   5.600   


In [19]:
a = 5.6
width = 10
precision = 3
print(f"a = {a:^{width}.{precision}f}")  # width and precision as variables

a =   5.600   
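
Passing the width as a variable is handy when it has to be computed from the data. The following sketch lines up a small table; the item names and prices are made-up sample data.

```python
items = [("apple", 1.5), ("watermelon", 12.25), ("fig", 0.333)]  # hypothetical data

width = max(len(name) for name, _ in items)  # width of the longest name
precision = 2

# Names are left-aligned in `width` characters; prices are right-aligned in 8.
rows = [f"{name:<{width}} | {price:>8.{precision}f}" for name, price in items]

for row in rows:
    print(row)
```

Because `width` is derived from the longest name, the `|` separators stay aligned even if the list of items changes.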


## Using `%` for output*[^1]
<hr>

[^1]: \* means this section may not be covered in class.

The percent sign `%` can also produce formatted output, with syntax rules similar to those of `printf` in the C language. Here are some examples for reference:

In [20]:
import math

print("the number is %d" % math.pi)  # %d truncates the float to an integer

the number is 3


In [21]:
print("the number is %.2f" % math.pi)  # output a float rounded to 2 decimals

the number is 3.14


In [22]:
print("the number is %.2f\n" % math.pi)  # \n adds an extra line break on top of the newline that print() appends

the number is 3.14



In [23]:
b = 4.56
print("the numbers are %.2f and %d" % (math.pi, b))  # output multiple variables, passed as a tuple

the numbers are 3.14 and 4


In [24]:
print("the number is %5.2f" % math.pi)  # output a float rounding to 2 decimals with width 5

the number is  3.14


In [25]:
print("the number is %s" % 321)  # output in a string format

the number is 321


In [26]:
print("%.1f%%" % b)  # %% prints a literal percent sign; the value itself is not multiplied by 100

4.6%
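
To compare the three approaches side by side, the sketch below builds the same sentence with `%`, `format`, and an f-string. The name and score are sample values. Note that the `%` operator needs the value multiplied by 100 by hand, while the `%` type code in the other two styles does this automatically.

```python
name = "zhang"
score = 0.8567

by_percent = "%s scored %.1f%%" % (name, score * 100)  # manual scaling, literal %%
by_format = "{} scored {:.1%}".format(name, score)     # the % type code scales by 100
by_fstring = f"{name} scored {score:.1%}"

print(by_percent)
print(by_format)
print(by_fstring)
```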


## Read and write files
<hr>

Python works with text files in three steps: **open → operate → close**. It opens a file from its storage location, performs read or write operations, and then closes it. While the file is open, the operating system keeps resources allocated for it, and on some systems (notably Windows) other programs may be unable to rename or delete it. Closing the file flushes any buffered writes to disk and releases those resources.

Python uses the function `open()` to open a file object. The syntax is:

|open(filename, mode)||
|--|--|
|filename|A string containing the path and file name;|
||the default location is the current working directory|
|mode|Read/write mode; the default is read-only mode (`r`)|

The read/write modes include:

| Mode | Description | File Pointer | File Existence |
|------|-------------|--------------|----------------|
| `'r'`  | **Read-only** (default) | Starts at beginning | File must exist |
| `'w'`  | **Write** (overwrites existing file) | Starts at beginning | Creates file if missing |
| `'a'`  | **Append** (adds to end of file) | Starts at end | Creates file if missing |
| `'r+'` | **Read + Write** | Starts at beginning | File must exist |
| `'w+'` | **Write + Read** (overwrites) | Starts at beginning | Creates file if missing |
| `'a+'` | **Append + Read** | Starts at end | Creates file if missing |

Comparisons between different modes:

| Mode      | `r`          | `r+`         | `w`          | `w+`         | `a`          | `a+`         |
|-------|------------|------------|------------|------------|------------|------------|
| Read       | &check;| &check; |            | &check; |            | &check; |
| Write       |            | &check; | &check; | &check; | &check; |&check; |
| Create      |            |            | &check; | &check; | &check; | &check; |
| Overwrite     |            |            | &check; | &check; |            |            |
| Start at beginning | &check; | &check; | &check; | &check; |            |            |
| Start at end |            |            |            |            | &check; |&check; |
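
The difference between the `w`, `a`, and `r` modes can be observed with a short experiment. This is a sketch that works inside a temporary directory so that no real files are touched; the file names are hypothetical, and it relies on the `with` statement, which closes each file automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")  # hypothetical file name

    with open(path, "w") as f:   # 'w' creates the missing file
        f.write("first\n")
    with open(path, "a") as f:   # 'a' keeps "first" and adds to the end
        f.write("second\n")
    with open(path, "r") as f:
        after_append = f.read()

    with open(path, "w") as f:   # 'w' discards the existing content
        f.write("third\n")
    with open(path, "r") as f:
        after_overwrite = f.read()

    # 'r' requires the file to exist.
    try:
        open(os.path.join(tmp, "missing.txt"), "r")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

print(repr(after_append))     # 'first\nsecond\n'
print(repr(after_overwrite))  # 'third\n'
print(missing_raises)         # True
```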


For example:

In [27]:
f = open("test.txt", "w")  # create a file in the current folder
# to create the file on the E drive instead: f = open("E:\\files\\test.txt", "w")
f.write("My name is Tim Cook\nHis name is Elon Musk")  # write the contents and use \n for line break
f.close()  # close the opened file

The above program creates a new .txt file in the current directory and writes content to it with the `write()` method, where `\n` represents a newline.
- To create or modify a file at a specific location, include the **file path in the filename string**.

The following code reads all the contents of the file with the `read()` method, which returns a string.

In [28]:
f = open("test.txt", "r")
text = f.read()  # read the contents and return a string (avoid the name `str`, a built-in type)
print(text)  # print the string
f.close()  # close the opened file

My name is Tim Cook
His name is Elon Musk


To read just one line, you can use the `readline()` method. Alternatively, you can use the `readlines()` method to read all lines, which returns a **list** where each element is a line from the file.
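
A minimal sketch of `readline()` and `readlines()` follows. It recreates the two-line file from the earlier example inside a temporary directory (an assumption made here so the example leaves no files behind) and uses `with` blocks, which close the file automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w") as f:
        f.write("My name is Tim Cook\nHis name is Elon Musk")

    with open(path, "r") as f:
        first_line = f.readline()   # includes the trailing "\n"
        second_line = f.readline()  # last line, no trailing newline in this file
        at_end = f.readline()       # empty string signals the end of the file

    with open(path, "r") as f:
        all_lines = f.readlines()   # list with one string per line

print(repr(first_line))  # 'My name is Tim Cook\n'
print(all_lines)         # ['My name is Tim Cook\n', 'His name is Elon Musk']
```

Each line keeps its newline character, so `line.strip()` or `line.rstrip("\n")` is often applied before further processing.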

- In practice, file reading and writing in Python is usually done with the `with open(...) as f:` statement. File operations may raise an `OSError` (of which `IOError` is an alias in Python 3); if that happens, a later `f.close()` call may never run, leaving the file open. The `with` statement closes the file automatically when the block ends, even after an error, so you can **omit the `f.close()` call** safely.

The above reading and writing code can also be rewritten as:

In [29]:
# write contents
with open("test.txt", "w") as f:
    f.write("My name is Tim Cook\nHis name is Elon Musk")

# read contents
with open("test.txt", "r") as f:
    text = f.read()
    print(text)

My name is Tim Cook
His name is Elon Musk
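
That the `with` statement closes the file even when an error occurs can be checked through the file object's `closed` attribute. In this sketch the `ValueError` is raised deliberately to simulate a failure, and the file lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    try:
        with open(path, "w") as f:
            f.write("partial")
            raise ValueError("simulated failure")  # deliberate error inside the block
    except ValueError:
        pass

    closed_after_error = f.closed  # the with block closed the file despite the error

    with open(path, "r") as g:
        saved = g.read()           # the text written before the error was flushed

print(closed_after_error)  # True
print(saved)               # partial
```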


To read and write data files such as .txt, .xls, .xlsx, and .csv, it is common to use the `pandas` library, which both reads the files and processes the data. Readers can refer to the Pandas chapter in this book for more details.

To read and write Word format files, libraries such as `textract` and `docx2txt` can be used. This book does not go into further detail on this topic; interested readers can look up relevant information online.

## `os`, `sys` library*
<hr>

###  `os` library
<hr>

The `os` module is part of Python's standard library. It provides functions for interacting with the operating system: handling files, directories, processes, and environment variables.

Below are some common uses of `os`:

```python
import os

print(os.name)  # get the name of the operating system family: 'posix' (Linux/macOS) or 'nt' (Windows)
print(os.getcwd())  # get the current working directory
print(os.listdir())  # get a list of all files and directories in the current working directory
```

- Change working directory

```python
os.chdir('/path/to/directory')  # change current working directory to new address: '/path/to/directory'
```

- Create or delete a directory

```python
import os

os.mkdir('new_dir')  # create a new directory
os.makedirs('parent/child/grandchild')  # create nested directories, including missing parents
os.rmdir('new_dir')  # delete an empty directory
os.removedirs('parent/child/grandchild')  # delete the leaf directory, then each parent that becomes empty
```
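
The directory functions above can be tried safely inside a temporary directory. This is a sketch under that assumption; the directory names mirror the ones used above. It also shows that `os.removedirs` keeps climbing upward only while the parent directories are empty.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    single = os.path.join(tmp, "new_dir")
    nested = os.path.join(tmp, "parent", "child", "grandchild")

    os.mkdir(single)     # one level only; the parent must already exist
    os.makedirs(nested)  # creates parent, child and grandchild in one call
    created = sorted(os.listdir(tmp))

    # Removes grandchild, child and parent, then stops at tmp,
    # which is not empty because new_dir is still there.
    os.removedirs(nested)
    after_removedirs = os.listdir(tmp)

    os.rmdir(single)     # works because new_dir is empty
    after_rmdir = os.listdir(tmp)

print(created)           # ['new_dir', 'parent']
print(after_removedirs)  # ['new_dir']
print(after_rmdir)       # []
```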

- File renaming `rename` or deletion `remove`

```python
import os

file_path = "new_file.txt"  
with open(file_path, "w") as f:
    f.write("Hello, this is a new file!")

os.rename('new_file.txt', 'new_file2.txt')  # rename the file
os.remove('new_file2.txt')  # remove the file
```

- Get environment variable

```python
print(os.environ)  # get all the environment variables
print(os.environ.get('specific'))  # get the environment variable named 'specific' (None if it is not set)
os.environ['NEW_VAR'] = 'Hello'  # set the value of an environment variable
```

- Path operations

```python
import os.path

print(os.path.abspath('file.txt'))  # get the absolute path of 'file.txt'
print(os.path.dirname('file.txt'))  # get the directory part of the path (an empty string here)
print(os.path.join('dir', 'file.txt'))  # join path components with the correct separator
print(os.path.exists('file.txt'))  # check whether the path 'file.txt' exists
print(os.path.isfile('file.txt'))  # check whether 'file.txt' is an existing file
print(os.path.isdir('dir'))  # check whether 'dir' is an existing directory
print(os.path.splitext('file.txt'))  # split into (root, ext): the part before the extension and the extension itself
print(os.path.basename('/path/to/file.txt'))  # get the final component of the path: 'file.txt'
```
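
Several of these functions only manipulate strings and do not touch the disk, so they can be combined freely. The sketch below takes a path apart and puts it back together; the path `data/reports/sales.csv` is a made-up example.

```python
import os.path

path = os.path.join("data", "reports", "sales.csv")  # hypothetical path

folder = os.path.dirname(path)      # everything before the last separator
name = os.path.basename(path)       # 'sales.csv'
root, ext = os.path.splitext(name)  # ('sales', '.csv')

# Build the name of a sibling file with a different extension.
backup = os.path.join(folder, root + ".bak")

print(name, root, ext)
print(backup)
```

Using `os.path.join` rather than concatenating with `/` or `\\` keeps the code portable across operating systems.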

The following example gets the file name of the current file.

```python
import os

# __file__ holds the path of the current script (it is not defined in a notebook or the interactive shell)
current_file_path = __file__

# get the current file name
current_file_name = os.path.basename(current_file_path)
```

- Process management

In [30]:
import os

print(os.getpid())  # get the id of the current process
print(os.getppid())  # get the id of the parent process

2199
2152


###  `sys` library
<hr>

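The `sys` module provides variables and functions related to the Python interpreter and its environment. It is useful for handling command-line arguments, the runtime environment, standard input and output, and similar tasks.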

- Get the version of current Python using `sys.version`

In [31]:
import sys

print(sys.version)

3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:54:21) [Clang 16.0.6 ]


- Get the name of the current platform using `sys.platform`

In [32]:
print(sys.platform)  # get the current platform, such as 'win32', 'linux', 'darwin'

darwin


- Get the module search paths using `sys.path`

```python
print(sys.path)  # get all the paths Python searches when importing modules
sys.path.append("/my/custom/path")  # append a search path
```
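
Two more attributes of `sys` are often used for the tasks mentioned at the start of this section. They are not covered in the text above, so treat this as a supplementary sketch: `sys.version_info` gives the version as comparable numbers, and `sys.argv` holds the command-line arguments.

```python
import sys

# sys.version_info is tuple-like, so it is easier to compare than the sys.version string.
major, minor = sys.version_info.major, sys.version_info.minor
supports_fstring_equals = sys.version_info >= (3, 8)  # the f"{x = }" form needs Python 3.8+

# sys.argv is a list of strings; element 0 is normally the script name,
# and the remaining elements are the arguments passed on the command line.
script_args = sys.argv[1:]

print(f"Python {major}.{minor} on {sys.platform}")
print("extra arguments:", script_args)
```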

## Exercises
<hr>

```{exercise-start}
:label: read-content
```
Which method is used to read the entire content of a file as a string?


A.&nbsp;&nbsp;  read()

B.&nbsp;&nbsp;  readline()

C.&nbsp;&nbsp;  readlines()

D.&nbsp;&nbsp;  read_file()

```{exercise-end}
```

````{solution} read-content
:class: dropdown
A
````

```{exercise-start}
:label: output-number2
```
What is the output of the following program:

```python
print('{0:.2%}'.format(1.0 / 3))
```


A.&nbsp;&nbsp;  0.33

B.&nbsp;&nbsp;  33.33%

C.&nbsp;&nbsp;  0.33%

D.&nbsp;&nbsp;  33%

```{exercise-end}
```

````{solution} output-number2
:class: dropdown
B
````

```{exercise-start}
:label: default-read
```
The default opening mode when opening a file with the open() function is 'r' for 'reading'.


A.&nbsp;&nbsp;  True

B.&nbsp;&nbsp;  False

```{exercise-end}
```

````{solution} default-read
:class: dropdown
A
````

```{exercise-start}
:label: with
```
What is the purpose of the 'with' statement in file handling?


A.&nbsp;&nbsp;  create a new file

B.&nbsp;&nbsp;  open a file

C.&nbsp;&nbsp;  automatically close the file

D.&nbsp;&nbsp;  save a file

```{exercise-end}
```

````{solution} with
:class: dropdown
C
````

```{exercise-start}
:label: write-w
```
What happens to the original file content if you open a file like this:
```python
open('test.txt', 'w')
```

A.&nbsp;&nbsp;  The original content will be overwritten

B.&nbsp;&nbsp;  Any new content will be added after the original content

```{exercise-end}
```

````{solution} write-w
:class: dropdown
A
````

```{exercise-start}
:label: write-a
```
What does the following code do?

```python
with open("file.txt", "a") as file:
   file.write("data")
```

A.&nbsp;&nbsp;  read the content of 'file.txt'

B.&nbsp;&nbsp;  append "data" to "file.txt"

C.&nbsp;&nbsp;  create a new file "file.txt"

D.&nbsp;&nbsp;  replace the content of "file.txt" with "data"

```{exercise-end}
```

````{solution} write-a
:class: dropdown
B
````

```{exercise}
:label: output-number
Print 0.0003278 as a percentage rounded to 4 decimal places.
```

````{solution} output-number
:class: dropdown

```{code-block} python
print("{:.4%}".format(0.0003278))
```

or

```{code-block} python
a = 0.0003278
print(f"{a:.4%}")
```
````

```{exercise}
:label: tesla
Write the following text to a file 'tesla.txt', then read the file and print its contents.

Tesla, Inc. is an American multinational automotive and clean energy company. Headquartered in Austin, Texas, it designs, manufactures and sells battery electric vehicles (BEVs), stationary battery energy storage devices from home to grid-scale, solar panels and solar shingles, and related products and services.

Tesla was incorporated in July 2003 by Martin Eberhard and Marc Tarpenning as Tesla Motors. Its name is a tribute to inventor and electrical engineer Nikola Tesla. In 2008, the company began production of its first car model, the Roadster sports car, followed by the Model S sedan in 2012, the Model X SUV in 2015, the Model 3 sedan in 2017, the Model Y crossover in 2020, the Tesla Semi truck in 2022 and the Cybertruck pickup truck in 2023.
```

````{solution} tesla
:class: dropdown

```{code-block} python
with open("tesla.txt", "w") as f:
    f.write(
        """Tesla, Inc. is an American multinational automotive and clean energy company. Headquartered in Austin, Texas, it designs, manufactures and sells battery electric vehicles (BEVs), stationary battery energy storage devices from home to grid-scale, solar panels and solar shingles, and related products and services.
        
Tesla was incorporated in July 2003 by Martin Eberhard and Marc Tarpenning as Tesla Motors. Its name is a tribute to inventor and electrical engineer Nikola Tesla. In 2008, the company began production of its first car model, the Roadster sports car, followed by the Model S sedan in 2012, the Model X SUV in 2015, the Model 3 sedan in 2017, the Model Y crossover in 2020, the Tesla Semi truck in 2022 and the Cybertruck pickup truck in 2023."""
    )

with open("tesla.txt", "r") as f:
    text = f.read()
    print(text)
```
````
