# Working with Files in Python

In the last lesson we discussed input and output, but only the manual kind: typing into the console and printing to it. In this lecture we look in detail at file input and output.

## 1. Why work with files?
I mean, really, why? We were living a perfectly normal life without them! Well, you should, and here's why:
Look at the programs you use. Did you look? Carefully? How many of them don't work with files at all? Practically none. Games load textures from files and store saves in them, text editors load text, graphics editors load image data. Almost all serious programs read or write files in one way or another.
## 2. How to work with files?
The good news is that working with files at a basic level is not difficult at all! It will be almost no different from reading and outputting data to the console.
The algorithm is simple. Let's understand it.

Python provides a built-in open() function to work with files. This function is used to open a file and create a file object, which can then be used to perform various operations such as reading, writing, or modifying the file.
### Opening a File
To open a file, you need to use the open() function and provide the file path and the mode in which you want to open the file. The file path can be either an absolute path (the complete path from the root directory) or a relative path (the path relative to the current working directory).

In [3]:
file = open('file.txt', 'r')

In the above example, 'r' is the mode, which stands for read mode. Other common modes include:

* 'w': Write mode (overwrites the file if it exists, creates a new file if it doesn't)
* 'a': Append mode (opens the file for appending, creates a new file if it doesn't exist)
* 'x': Exclusive creation mode (creates a new file but raises an error if the file already exists)
* 'b': Binary mode (a modifier added to another mode, as in 'rb' or 'wb'; used for non-text files like images, videos, etc.)
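
To see how these modes differ in practice, here is a minimal sketch. It assumes a throwaway file named `modes_demo.txt` (a hypothetical name) inside a temporary directory, so no real files are touched. It also uses the `with` statement, which closes each file automatically at the end of its block.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    with open(demo_path, "w") as f:   # 'w' creates the file
        f.write("first\n")
    with open(demo_path, "w") as f:   # 'w' again: the old content is discarded
        f.write("second\n")
    with open(demo_path, "a") as f:   # 'a' keeps the content and adds to the end
        f.write("third\n")

    with open(demo_path, "r") as f:
        modes_result = f.read()

    try:
        open(demo_path, "x")          # 'x' refuses to open an existing file
        x_failed = False
    except FileExistsError:
        x_failed = True

print(repr(modes_result))  # 'second\nthird\n'
print(x_failed)            # True
```

The line `first` is gone because the second `'w'` open emptied the file before writing.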

### Reading from a File
There are several ways to read data from a file: `read()` reads the entire content, `readline()` reads one line, and `readlines()` reads all lines into a list.

In [None]:
# Read the entire file
contents = file.read()

# Read a single line (on a freshly opened file; after read() nothing is left)
line = file.readline()

# Read all remaining lines into a list
lines = file.readlines()
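
The difference between the three methods is easiest to see on a small file. The sketch below assumes a hypothetical three-line file `sample.txt` created in a temporary directory, and reopens it before each call so that every method starts from the beginning.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(sample_path, "w") as f:
        f.write("one\ntwo\nthree\n")

    with open(sample_path, "r") as f:
        whole = f.read()           # the entire content as one string
    with open(sample_path, "r") as f:
        first = f.readline()       # only the first line, newline included
    with open(sample_path, "r") as f:
        all_lines = f.readlines()  # a list with one string per line

print(repr(whole))      # 'one\ntwo\nthree\n'
print(repr(first))      # 'one\n'
print(all_lines)        # ['one\n', 'two\n', 'three\n']
```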

### Writing to a File
To write to a file, you need to open it in write ('w') or append ('a') mode:

In [9]:
file = open('example.txt', 'w')  # Open the file for writing

text = '''
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea — let's do more of those!
'''
file.write(text)  # Write a string to the file
file.close()

### Closing a File
It's important to close a file after you've finished working with it to release system resources and ensure that any buffered data is written to the file:

In [None]:
file.close()

**We can also write to a file with print**

In [6]:
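f = open("new_file.txt", "w")
print("Do it", file=f)  # print accepts a file object through its file parameter
f.close()  # close the file so the text is actually written to disk
!cat new_file.txt  # the cat utility prints the file's contents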

Do it


Now let's read it back from Python

In [7]:
f_new = open("new_file.txt", 'r')
f_new.read()

'Do it\n'

And once again

In [8]:
f_new.read()

''

**_Why is there nothing?_**

The reason is how reading works:

The file object keeps a cursor (the current position) that moves through the file as we read. A call to `read()` without arguments reads everything from the cursor to the end of the file and leaves the cursor there.

Therefore, if you call it twice in a row, the second call has nothing left to read: the cursor does not go back on its own, which is a problem.

Let's learn how to move the cursor the way we want. This is handled by:

1. `read(n)` - read up to `n` characters starting at the cursor position

2. `tell()` - return the current cursor position

3. `seek(offset)` - move the cursor to `offset`, counted from the beginning of the file

In [10]:
with open("example.txt", "r") as f:  # with opens the file for use inside the indented block and closes it automatically afterwards
    print(f.tell())
    print(f.read(10))  # read not the entire text, but only the first 10 characters
    print(f.tell())  # tell us where our cursor is currently
    print(f.seek(5))  # move the cursor to position 5 from the beginning; seek returns the new position
    print(f.tell())
    print(f.read(5))
    print(f.tell())

0

Beautiful
10
5
5
tiful
10


How can we move not relative to the beginning of the file, but relative to the current position?

In [None]:
with open("example.txt", "r") as f:
    print(f.read(10))
    print(f.tell())
    print(f.seek(f.tell() - 5))  # current position minus 5: the result of tell() can be used directly
    print(f.tell())
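
`seek` also takes an optional second argument that selects the reference point: `os.SEEK_SET` (the beginning, the default), `os.SEEK_CUR` (the current position) or `os.SEEK_END` (the end of the file). In text mode only seeks from the beginning are allowed (apart from jumping to the very end with `seek(0, 2)`), so the sketch below opens a file in binary mode. The file name `seek_demo.bin` and its content are assumptions made for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "seek_demo.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        f.read(6)                      # the cursor is now at 6
        f.seek(-4, os.SEEK_CUR)        # 4 bytes back from the current position
        pos_after_relative = f.tell()  # 2
        chunk = f.read(3)              # b'234'
        f.seek(-2, os.SEEK_END)        # 2 bytes before the end of the file
        tail = f.read()                # b'89'

print(pos_after_relative, chunk, tail)  # 2 b'234' b'89'
```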

What else looks like a problem? Newline characters (and other whitespace such as tabs) count as separate characters, so they are included in what `read(n)` returns and in the positions used by `tell()` and `seek()`.

What if we try to read line by line?

In [None]:
with open("example.txt", "r") as f:
    for line in f:  # iterating over the file yields one line at a time, newline included
        print(line.strip())
    print('-' * 30)

We can also read and write through the same file object:

In [12]:
fh = open('example.txt', 'r+')  # 'r+' allows both writing and reading
fh.seek(11)
print(fh.read(5))
print(fh.tell())
fh.seek(11)
fh.write('Zen')  # overwrites 3 characters in place; nothing is inserted
fh.seek(0)
content = fh.read()
print(content)
fh.close()

is be
16

Beautiful Zenbetter than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea — let's do more of those!



## Context Managers (with statement)
Python provides a convenient way to open and close files using the with statement, which automatically takes care of closing the file, even if an exception occurs:

In [None]:
with open('path/to/file.txt', 'r') as file:
    contents = file.read()
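
A small sketch of the "even if an exception occurs" part: the block below raises an error on purpose and then checks that the file was closed anyway and that the text written before the error reached the disk. The file name and the simulated `RuntimeError` are assumptions for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "closed_demo.txt")  # hypothetical file name

    try:
        with open(path, "w") as f:
            f.write("partial data")
            raise RuntimeError("something went wrong")  # simulated failure
    except RuntimeError:
        pass

    closed_after_error = f.closed  # the with statement closed the file

    with open(path, "r") as check:
        saved = check.read()       # closing flushed the buffered text

print(closed_after_error)  # True
print(saved)               # partial data
```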

## File Modes
Python supports various file modes that control how the file is opened and accessed. Here are some common file modes:

- 'r': Read mode (default mode)
- 'w': Write mode (overwrites the file if it exists, creates a new file if it doesn't)
- 'a': Append mode (opens the file for appending, creates a new file if it doesn't exist)
- 'x': Exclusive creation mode (creates a new file but raises an error if the file already exists)
- 'b': Binary mode (used for non-text files like images, videos, etc.)
- 't': Text mode (default mode, used for text files)
- '+': Updating mode (allows both reading and writing)

You can combine modes like 'rb' (read in binary mode) or 'w+b' (read and write in binary mode; like 'w', it empties the file first).
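
In binary mode the file object works with `bytes` instead of `str`. A minimal sketch, assuming a hypothetical file `raw_demo.bin` in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "raw_demo.bin")  # hypothetical file name

    with open(path, "wb") as f:           # write binary
        f.write(bytes([0, 255, 16, 32]))  # four raw bytes, not text

    with open(path, "rb") as f:           # read binary
        raw = f.read()

print(type(raw))   # <class 'bytes'>
print(list(raw))   # [0, 255, 16, 32]
```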
## File Operations
Python's standard library provides functions for common operations on files:
### Renaming and Moving Files
To rename or move a file, you can use the os.rename() function from the os module:

In [None]:
import os

source_path = 'path/to/source.txt'
destination_path = 'path/to/destination.txt'
os.rename(source_path, destination_path)
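
The paths in the cell above are placeholders, so here is a runnable sketch of the same call inside a temporary directory. The names `draft.txt`, `archive` and `final.txt` are hypothetical. Note that the destination directory has to exist before the move.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    old_path = os.path.join(tmp_dir, "draft.txt")
    new_dir = os.path.join(tmp_dir, "archive")
    new_path = os.path.join(new_dir, "final.txt")

    with open(old_path, "w") as f:
        f.write("content")

    os.mkdir(new_dir)              # os.rename does not create directories
    os.rename(old_path, new_path)  # rename and move in one step

    old_exists = os.path.exists(old_path)
    new_exists = os.path.exists(new_path)

print(old_exists, new_exists)  # False True
```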

### Deleting Files
To delete a file, you can use the os.remove() function from the os module:

In [None]:
import os

file_path = 'path/to/file.txt'
os.remove(file_path)

### Getting File Information
Python provides several functions to retrieve information about files, such as os.path.exists(), os.path.isfile(), os.path.isdir(), os.path.getsize(), and os.path.getmtime().

In [None]:
import os

file_path = 'path/to/file.txt'

# Check if the file exists
if os.path.exists(file_path):
    # Check if it's a regular file
    if os.path.isfile(file_path):
        # Get the file size
        file_size = os.path.getsize(file_path)
        print(f'File size: {file_size} bytes')

        # Get the last modification time
        last_modified = os.path.getmtime(file_path)
        print(f'Last modified: {last_modified}')
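
`os.path.getmtime()` returns a timestamp: a number of seconds since the epoch, which is not very readable when printed. One way to make it readable, sketched here on a hypothetical file `info_demo.txt`, is to convert it with `datetime.datetime.fromtimestamp()`.

```python
import datetime
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "info_demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("hello")

    size = os.path.getsize(path)    # 5 (bytes)
    mtime = os.path.getmtime(path)  # float, seconds since the epoch
    modified_at = datetime.datetime.fromtimestamp(mtime)  # local date and time

print(f"File size: {size} bytes")
print(f"Last modified: {modified_at.strftime('%Y-%m-%d %H:%M:%S')}")
```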

### File Paths and Directories
Python provides the os.path module for working with file paths and directories. You can use functions like os.path.join(), os.path.split(), os.path.dirname(), and os.path.basename() to manipulate file paths.

In [None]:
import os

# Join multiple path components
path = os.path.join('path', 'to', 'file.txt')  # 'path/to/file.txt' on Linux/macOS; Windows uses backslashes

# Split a path into components
head, tail = os.path.split(path)  # ('path/to', 'file.txt')

# Get the directory name
dirname = os.path.dirname(path)  # 'path/to'

# Get the basename
basename = os.path.basename(path)  # 'file.txt'

## Example Program
Let's look at an example program that reads two numbers from a file 'input.txt', adds them, and writes the result to a file 'output.txt'.

In [None]:
# Open files for reading and writing
with open('input.txt', 'r') as fin, open('output.txt', 'w') as fout:
    # Read two numbers from the file
    a = int(fin.readline())
    b = int(fin.readline())
    
    # Write the sum of the numbers to the file
    fout.write(str(a + b))

## Error Handling
To safely handle files, you can use a 'try...except' block to catch potential errors.

In [None]:
try:
    with open('input.txt', 'r') as fin, open('output.txt', 'w') as fout:
        a = int(fin.readline())
        b = int(fin.readline())
        fout.write(str(a + b))
except FileNotFoundError:
    print("File not found.")
except ValueError:
    print("Error reading a number from the file.")
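
To see each branch fire, the sketch below runs the same logic on three hypothetical inputs in a temporary directory: a valid file, a file with a non-numeric line, and a file that does not exist. One deliberate difference from the cell above: the output file is opened only after both numbers have been parsed. In the cell above, `open('output.txt', 'w')` runs before the numbers are read, so a `ValueError` still leaves an empty `output.txt` behind.

```python
import os
import tempfile

results = {}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical input files: one valid, one with a non-numeric line.
    samples = {"good.txt": "2\n3\n", "bad.txt": "2\nthree\n"}
    for name, content in samples.items():
        with open(os.path.join(tmp_dir, name), "w") as f:
            f.write(content)

    # "missing.txt" is deliberately never created.
    for name in ["good.txt", "bad.txt", "missing.txt"]:
        in_path = os.path.join(tmp_dir, name)
        out_path = in_path + ".out"
        try:
            with open(in_path, "r") as fin:
                a = int(fin.readline())
                b = int(fin.readline())
            # Open the output only after parsing succeeded.
            with open(out_path, "w") as fout:
                fout.write(str(a + b))
            status = "ok"
        except FileNotFoundError:
            status = "File not found."
        except ValueError:
            status = "Error reading a number from the file."
        # Store the status and whether an output file was produced.
        results[name] = (status, os.path.exists(out_path))

for name, outcome in results.items():
    print(name, outcome)
```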

Working with files in Python is simple and intuitive. It's important to remember to close files after using them and to handle potential errors to ensure your program runs reliably.

You have probably noticed statements like `import os` in the examples above and wondered what they are.

### MODULES IN PYTHON

A module is a standalone file containing Python code, which can define variables, functions, classes, and other Python objects. Modules are used to organize and reuse code across different parts of a program or across multiple programs.
A package is a collection of modules organized in a directory hierarchy. Packages allow for better organization and structuring of larger code bases by grouping related modules together.

#### Importing Modules and Packages

**Option 1:**
```python
import sys
sys.getsizeof(36)
```

**Option 2:**
```python
from sys import getsizeof
getsizeof(36)
```

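Both options return the same number: the size of the integer object `36` in bytes. The exact value depends on the interpreter build (for example, `28` on a typical 64-bit CPython 3 and `14` on a 32-bit build).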
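
The two styles differ only in which names they bring into your code: `import sys` gives you the module and you reach the function through it, while `from sys import getsizeof` gives you the function directly. A quick sketch to check that both names refer to the same function:

```python
import sys
from sys import getsizeof

same_function = sys.getsizeof is getsizeof  # both names point to one object
size_a = sys.getsizeof(36)
size_b = getsizeof(36)

print(same_function)     # True
print(size_a == size_b)  # True
```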

### PYTHON STANDARD LIBRARY

- **sys** - functions and constants for interacting with the interpreter
- **datetime** - handling dates and times
- **os** - interface to basic operating system services
- **os.path** - platform-independent file path manipulation
- **string** - string constants and helper classes
- **re** - working with regular expressions
- **csv** - working with the CSV (Comma Separated Values) tabular format
- **gzip** - reading and writing gzip-compressed (.gz) files
- **zipfile** - creating and working with .zip archives
- **configparser** - working with configuration (ini) files (named `ConfigParser` in Python 2)
- And many others (hashlib, socket, smtplib, sqlite3, etc.)

For more information:
- [Official Documentation](https://docs.python.org/3/library/index.html)
- [Wikipedia Article](https://en.wikipedia.org/wiki/Python_Standard_Library)

The next topic, and a turning point that opens the way to everything after it, is **functions**.

# FUNCTIONS
Functions are named blocks of code (subroutines) that can be called from another part of a program.
Syntax: 
```
def function_name(parameters):
    ...
    function body
    ...
    return return_value
```

**Arguments** of a function are the data that you pass to the function to perform the operation.

**The body** of a function is a block of code that performs a specific task.

**The return value** is what the function returns after performing the operation.

**Function parameters** are the names of the variables that you use in the function definition to work with the passed data (arguments).

**A function call** is a place in the code where you use the function name and pass values for the arguments.

In [None]:
# Example
def hello(n):
    print("Hello, I am a function")
    print(n)
    
# A function can be called from inside another function
def main():
    print("Now we will call the function")
    hello(5)

main()  # nothing is printed until main itself is called

**The `return` statement**

In [None]:
def test():
    print("Returning a value")
    return 667
    
# Call the function and store the returned value:
a = test()

How do we call a function that returns several values?

In [None]:
def multi_ret():
    print("This curious function")
    print("returns three values")
    return 1, 'test', [796, 69, 15]
# This way: the returned tuple is unpacked into three variables
a, b, c = multi_ret()

**What happens when we call a function?**

Arguments `num1` and `num2` are passed to the function and assigned to the local variables `a` and `b` respectively.

`a = num1, b = num2`

Next, transformations are performed on these local variables.

After the `return` keyword, we specify what value the function should return.

So, `sum_nums_sqrt = ans`

(Lines after an executed `return` statement do not run: `return` exits the function immediately, much like `break` exits a loop.)
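
The walkthrough above mentions the names `num1`, `num2`, `a`, `b`, `ans` and `sum_nums_sqrt`, but the function they belong to is not shown here. The sketch below is an assumed reconstruction: the function name `sqrt_of_sum`, its body and the input values are all hypothetical, chosen only so that the names line up with the steps described.

```python
def sqrt_of_sum(a, b):        # hypothetical function; the body is an assumption
    ans = (a + b) ** 0.5      # transformations on the local variables a and b
    return ans                # the value handed back to the caller
    print("never printed")    # unreachable: return has already exited the function

num1 = 9
num2 = 16
sum_nums_sqrt = sqrt_of_sum(num1, num2)  # inside the call: a = num1, b = num2
print(sum_nums_sqrt)  # 5.0
```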

The order of the passed arguments is important! Let's look at another function and compare the results.

In [None]:
def sum_sqrt(a, b):
    return a**2 + b**2

num1 = 5
num2 = 2
print(sum_sqrt(num1, num2))  # 5**2 + 2**2 = 25 + 4
print(sum_sqrt(num2, num1))  # 2**2 + 5**2 = 4 + 25
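
Note that `sum_sqrt` is symmetric, so both calls above print `29`. The effect of argument order becomes visible with a function that treats its parameters differently. A minimal sketch with a hypothetical function `diff_of_squares`:

```python
def diff_of_squares(a, b):  # hypothetical function for illustration
    return a**2 - b**2

num1 = 5
num2 = 2
first_order = diff_of_squares(num1, num2)   # 5**2 - 2**2 = 25 - 4
second_order = diff_of_squares(num2, num1)  # 2**2 - 5**2 = 4 - 25
print(first_order)   # 21
print(second_order)  # -21
```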

We can also set default values for parameters: the caller may then pass fewer arguments, and the omitted parameters take their default values. Parameters with default values are optional, while those without are required. Separately, an argument passed by position is called a **positional argument**, and an argument passed as `name=value` is called a **keyword argument**.

In [None]:
def salute(person, salutation="Howdy"):
    return f"{salutation}, {person}!"

print(salute("Charlie"))
print(salute("Daisy", "Greetings"))


But note: parameters with default values must come after the parameters without them, otherwise Python raises a `SyntaxError`:

In [14]:
def salute(salutation="Howdy", person):
    return f"{salutation}, {person}!"

print(salute("Charlie"))
print(salute("Daisy", "Greetings"))

SyntaxError: non-default argument follows default argument (2310667371.py, line 1)

Let's go back to functions with several parameters

In [17]:
# Function to calculate the total cost of items with a tax rate
def compute_total_cost(amount, units=3, vat_rate=0.08):
    subtotal = amount * units
    subtotal += subtotal * vat_rate
    return subtotal

print(compute_total_cost(20))
print(compute_total_cost(20, 5))
print(compute_total_cost(20, 5, 0.06))

64.8
108.0
106.0


But what if we want to use a specific parameter value while using defaults for others, or we don't remember the order of the parameters? We can explicitly specify parameter values.

In [None]:
print(compute_total_cost(20, vat_rate=0.06))

In [18]:
# The following won't work because a positional argument (20) cannot follow a keyword argument
print(compute_total_cost(vat_rate=0.06, 20))

SyntaxError: positional argument follows keyword argument (20307214.py, line 2)

In [None]:
# But we can rewrite it like this, and it will work
print(compute_total_cost(vat_rate=0.06, amount=20))

## WHEN THERE ARE A LOT OF PARAMETERS
`*args` and `**kwargs` are special parameters that can be used in Python to pass a variable number of arguments to a function. They allow a function to accept an arbitrary number of positional arguments and an arbitrary number of keyword arguments, respectively.

+ `*args`:
  - In a function definition, `*args` lets the function accept an arbitrary number of positional arguments.
  - The `*` before the parameter name means that all extra positional arguments are collected (packed) into a tuple bound to that name. The name `args` itself is only a convention.

+ `**kwargs`:
  - In a function definition, `**kwargs` lets the function accept an arbitrary number of keyword arguments.
  - The `**` before the parameter name means that all extra keyword arguments are collected (packed) into a dictionary bound to that name, with the argument names as keys. The name `kwargs` is also only a convention.

Arbitrary Number of Parameters

In [None]:
def f1(*args):
    for item in args:
        print(item)

f1('hello', 'world', 42, True)

Arbitrary Number of Keyword Parameters

In [22]:
def f2(**kwargs):
    for key, value in kwargs.items():
        print(f"{key} = {value}")

# This would raise a TypeError because f2 accepts only keyword arguments
#f2(3, 'asd')

f2(asdf='hello', qwerty=42)

asdf = hello
qwerty = 42


You can combine ordinary positional and keyword parameters with `*args` and `**kwargs` in the same function.

In [21]:
def f1(*args, **kwargs):
    for item in args:
        print(item)
    for key, value in kwargs.items():
        print(f"{key} = {value}")

f1('hello', 'world', 42, True, asdf='hello', qwerty=42)

def f2(name, age=30, *args, **kwargs):
    print(f"name: {name}, age: {age}")
    for item in args:
        print(item)
    for key, value in kwargs.items():
        print(f"{key} = {value}")

f2('Alice', 25, 'Python', 'Developer', city='NYC', language='English')


hello
world
42
True
asdf = hello
qwerty = 42
name: Alice, age: 25
Python
Developer
city = NYC
language = English
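
The same `*` and `**` symbols can also be used at the call site, where they do the opposite job: they unpack a sequence into positional arguments and a dictionary into keyword arguments. A minimal sketch, using a hypothetical function `describe` and made-up data:

```python
def describe(name, age, *args, **kwargs):  # hypothetical function
    return f"{name} ({age}), extras={args}, options={kwargs}"

values = ["Alice", 25, "Python"]  # becomes name, age and one extra positional argument
options = {"city": "NYC"}         # becomes the keyword argument city="NYC"

described = describe(*values, **options)
print(described)  # Alice (25), extras=('Python',), options={'city': 'NYC'}
```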


## Lambda functions

When a function consists of a single `return` expression, it can be written as a lambda function.

For example:

```python
add3 = lambda a, b, c: a + b + c
print(add3(1, 3, 5))  # Output: 9
```

This lambda function takes three arguments `a`, `b`, and `c`, and returns their sum. (It is named `add3` rather than `sum` so that it does not shadow Python's built-in `sum()` function.)

Alternatively, you could write this as a regular function:

```python
def add3(a, b, c):
    return a + b + c

print(add3(1, 3, 5))  # Output: 9
```

However, when the logic is a single simple expression, the lambda syntax is more concise, especially when the function is passed directly as an argument to another function.

In [None]:
# Regular function
def square(x):
    return x ** 2

# Lambda function
square = lambda x: x ** 2

print(square(5))  # Output: 25

In [None]:
multiply = lambda x, y: x * y
print(multiply(3, 4))  # Output: 12

In [None]:
# Lambda function to sort a list of tuples by the second element
students = [('John', 22), ('Emily', 19), ('Michael', 21), ('Jessica', 20)]
sorted_students = sorted(students, key=lambda x: x[1])
print(sorted_students)  # Output: [('Emily', 19), ('Jessica', 20), ('Michael', 21), ('John', 22)]
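
The `key=` parameter is where lambdas are used most often, and it is accepted by more than `sorted()`. A short sketch on the same student list, using `max()` and the `reverse` option of `sorted()`:

```python
students = [('John', 22), ('Emily', 19), ('Michael', 21), ('Jessica', 20)]

# The student with the largest second element (age)
oldest = max(students, key=lambda s: s[1])

# Sort by age, oldest first
by_age_desc = sorted(students, key=lambda s: s[1], reverse=True)

# Sort by the length of the name; ties keep their original order
by_name_length = sorted(students, key=lambda s: len(s[0]))

print(oldest)          # ('John', 22)
print(by_age_desc)     # [('John', 22), ('Michael', 21), ('Jessica', 20), ('Emily', 19)]
print(by_name_length)  # [('John', 22), ('Emily', 19), ('Michael', 21), ('Jessica', 20)]
```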