# File Handling, Iterator and Generator

## File Handling

File handling in Python refers to creating, reading, updating, and deleting files on a computer. It relies on the built-in `open()` function and the standard-library `os` module. `open()` opens a file and returns a file object, which provides methods for reading, writing, and closing the file. The `os` module provides functions for interacting with the file system, such as renaming and deleting files and creating directories.
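As a rough illustration of the `os` functions mentioned above, the sketch below creates a directory, renames a file, and deletes it. It works inside a temporary directory, and the names `reports`, `draft.txt`, and `final.txt` are hypothetical, chosen only for this example.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing on the real file system is touched.
with tempfile.TemporaryDirectory() as workdir:
    subdir = os.path.join(workdir, "reports")       # hypothetical directory name
    os.mkdir(subdir)                                # create a directory

    old_path = os.path.join(subdir, "draft.txt")    # hypothetical file names
    new_path = os.path.join(subdir, "final.txt")

    with open(old_path, "w") as f:                  # create the file
        f.write("first version\n")

    os.rename(old_path, new_path)                   # rename the file
    names_after_rename = os.listdir(subdir)

    os.remove(new_path)                             # delete the file
    names_after_remove = os.listdir(subdir)

print(names_after_rename)   # ['final.txt']
print(names_after_remove)   # []
```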

The most common modes passed to `open()` are:

r: open an existing file for reading. A FileNotFoundError is raised if the file does not exist.

w: open a file for writing. Existing content is overwritten; if the file does not exist, it is created.

a: open a file for appending. Existing data is kept and new data is written at the end; if the file does not exist, it is created.

r+: open an existing file for reading and writing. Existing content is not erased, but writes start at the beginning of the file and overwrite what is there.

w+: open a file for writing and reading. Existing content is erased first; the file is created if it does not exist.

a+: open a file for appending and reading. Existing data is kept; the file is created if it does not exist.
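The difference between the `+` modes is easiest to see side by side. The following is a minimal sketch, assuming a temporary directory and a hypothetical file name `modes_demo.txt`; it is meant only to show where each mode starts writing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "modes_demo.txt")  # hypothetical file name

    with open(path, "w") as f:      # "w" creates the file (or empties an existing one)
        f.write("abcdef")

    with open(path, "r+") as f:     # "r+" keeps the content; writing starts at position 0
        f.write("XY")
        f.seek(0)
        after_r_plus = f.read()

    with open(path, "a+") as f:     # "a+" always writes at the end of the file
        f.write("!")
        f.seek(0)
        after_a_plus = f.read()

    with open(path, "w+") as f:     # "w+" empties the file first, then allows reading
        f.write("new")
        f.seek(0)
        after_w_plus = f.read()

print(after_r_plus)   # XYcdef
print(after_a_plus)   # XYcdef!
print(after_w_plus)   # new
```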

In [5]:
pwd() # current working directory.

'd:\\code files\\Ineron FSDS 2.0\\study'

In [6]:
%ls # Gives list of files in current directory.

 Volume in drive D has no label.
 Volume Serial Number is F02F-A4BF

 Directory of d:\code files\Ineron FSDS 2.0\study

File Not Found

To open a file in the current working directory, pass just the file name; for a file anywhere else, pass its full or relative path.
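A small sketch of both ways of addressing a file, assuming a temporary directory and a hypothetical file name `notes.txt`. The file is first written through its full path, then read back by bare name after switching the working directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Outside the working directory: build the full path explicitly.
    full_path = os.path.join(workdir, "notes.txt")  # hypothetical file name
    with open(full_path, "w") as f:
        f.write("saved by full path")

    # Inside the working directory: the bare file name is enough.
    previous_dir = os.getcwd()
    os.chdir(workdir)
    try:
        with open("notes.txt") as f:
            text = f.read()
    finally:
        os.chdir(previous_dir)   # restore the original working directory

print(text)   # saved by full path
```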


The `open()` function provides several modes for file handling:



### Write:

1. 'w' (write-only): opens the file for writing, but you cannot read from it. If the file already exists, its content will be overwritten. If the file does not exist, it will be created.

In [7]:
f = open("test1.txt", "w")
f.write("Hello, I am data scientist. I am excited to solve real world problems using my data skills.")
f.close() # Closing flushes buffered data to disk and releases the file handle.

### Read:


2. 'r' (read-only): opens the file for reading, but you cannot write to it. If the file does not exist, a FileNotFoundError is raised.

In [8]:
f = open("test1.txt", "r")
f.read()

'Hello, I am data scientist. I am excited to solve real world problems using my data skills.'

In [9]:
f.read() # After the first read, the cursor is at the end of the file, so nothing is left to read.

''

In [10]:
f.seek(0) # We can move the cursor back to the beginning by giving a position.

0

In [11]:
f.read() # Reads from index 0, i.e. the start of the file.

'Hello, I am data scientist. I am excited to solve real world problems using my data skills.'

In [12]:
f.seek(0)
f.read(13) # Reads at most the specified number of characters.

'Hello, I am d'

In [13]:
f.seek(0)
f.readline() # The file holds a single line with no trailing newline, so readline() returns all of it.

'Hello, I am data scientist. I am excited to solve real world problems using my data skills.'

In [14]:
f.close()

In [15]:
f = open("test2.txt", "w")
f.write("line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\nline 8\n")
f.close()

In [16]:
# Here now we have created file test2.txt with multiple lines in it.
# We can read that data line by line
f = open("test2.txt", 'r')
f.readline()

'line 1\n'

In [17]:
f.readline()

'line 2\n'

In [18]:
f.readline() # used for line by line reading

'line 3\n'

In [19]:
f.seek(0) # Go back to the start of the file that is already open.
f.readlines() # Reads all remaining lines and returns them as a list of strings.



['line 1\n',
 'line 2\n',
 'line 3\n',
 'line 4\n',
 'line 5\n',
 'line 6\n',
 'line 7\n',
 'line 8\n']

In [20]:
f.close()

#### Readline Vs Readlines

The `readline` and `readlines` methods both read data from a file, but they differ in what they return.

**readline** returns a single line of the file as a string. Called repeatedly, it reads the file line by line and returns an empty string once the end of the file is reached.

**readlines** reads the rest of the file and returns it as a list of strings, where each string is one line from the file.

For large files, `readline` is more memory efficient because only one line is held at a time. For small files, `readlines` can be more convenient because it returns all lines in a single list.
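The comparison above can be sketched as follows, assuming a small temporary file with the hypothetical name `lines.txt`. The last variant iterates over the file object directly, which also reads one line at a time and is the usual idiom for large files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "lines.txt")   # hypothetical file name
    with open(path, "w") as f:
        f.write("line 1\nline 2\nline 3\n")

    # readlines(): the whole file is loaded into one list.
    with open(path) as f:
        all_lines = f.readlines()

    # readline(): one line per call; an empty string marks the end of the file.
    one_by_one = []
    with open(path) as f:
        while True:
            line = f.readline()
            if line == "":
                break
            one_by_one.append(line)

    # Iterating over the file object also yields one line at a time.
    with open(path) as f:
        stripped = [line.rstrip("\n") for line in f]

print(all_lines == one_by_one)   # True
print(stripped)                  # ['line 1', 'line 2', 'line 3']
```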

### Append:

'a' (append): opens the file for writing, but appends data to the end of the file instead of overwriting its content. If the file does not exist, it will be created.

In [21]:
f = open("test2.txt", "a")
f.write("Hi, append me") # Added at the end; the existing content is kept.
f.close()

In [22]:
f = open("test3.txt", "a")
f.write("Hi\n How are you?\n")
f.close()

### Exclusive:

'x' (exclusive creation): creates a new file for writing, but raises a FileExistsError if the file already exists.

In [23]:
f = open("test4.txt", "x")
f.write("I am one of kind and you can't replace me")
f.close()

In [None]:
f = open("test4.txt", "x") # error: FileExistsError: [Errno 17] File exists: 'test4.txt'
f.write("It's useless to even try...")
f.close()

FileExistsError: [Errno 17] File exists: 'test4.txt'

In [None]:
f = open("test4.txt", "a")
f.write("Don't feel sad at least You can append me anytime :)")
f.close()

In [None]:
# The with statement closes the file automatically, even if an error occurs.
with open("test4.txt", "r") as f:
    x = f.read()
print(x)

I am one of kind and you can't replace meDon't feel sad at least You can append me anytime :)
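Exclusive mode and the `with` statement combine naturally when a file must never be overwritten. Below is a minimal sketch; the helper name `create_once` and the file name `unique.txt` are hypothetical, and returning `False` on failure is just one possible design.

```python
import os
import tempfile


def create_once(path, text):
    """Create `path` with `text`; return False instead of raising if it already exists."""
    try:
        with open(path, "x") as f:   # "x" refuses to touch an existing file
            f.write(text)
        return True
    except FileExistsError:
        return False


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "unique.txt")   # hypothetical file name
    first = create_once(path, "original")        # file is created
    second = create_once(path, "replacement")    # file exists, nothing is written
    with open(path) as f:
        content = f.read()

print(first, second)   # True False
print(content)         # original
```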


## Iterator 

An iterator is an object that contains a countable number of values.

An iterator is an object that can be iterated upon, meaning that you can traverse through all the values.

Technically, in Python, an iterator is an object that implements the iterator protocol, which consists of the methods `__iter__()` and `__next__()`.

`__iter__()`: called (through the built-in `iter()`) to initialize iteration. It returns an iterator object.

`__next__()`: called (through the built-in `next()`) to return the next value. When a for loop traverses an iterable object, it internally calls `iter()` to get an iterator object and then calls `next()` on it repeatedly. `__next__()` raises StopIteration to signal the end of the iteration.

Built-in iterable objects include strings, lists, tuples, sets, and dictionaries.
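To make the protocol concrete, here is a minimal sketch of a hand-written iterator. The class name `Countdown` is hypothetical and exists only for this example; it counts down from a start value to 1.

```python
class Countdown:
    """Iterator that yields start, start - 1, ..., 1 (illustrative class)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration      # signals the end of the iteration
        value = self.current
        self.current -= 1
        return value


values = list(Countdown(3))
print(values)   # [3, 2, 1]

# A for loop calls iter() once and then next() until StopIteration is raised.
for n in Countdown(2):
    print(n)    # 2, then 1
```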


In [28]:
# list 
my_list = [1, 2, 3, 4]
for i in my_list:
    print(i)

1
2
3
4


In [None]:
dir(my_list)

['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

![image.png](attachment:image.png)


Here we can notice the `__iter__()` method, so we can say that a list is iterable.

An iterable is an object capable of returning its elements one by one. This can be a sequence, like a list or a string, or a more abstract collection, like a dictionary or a set.

Iterator: An iterator is an object that implements the `__next__` method and can be used to access elements one at a time. When there are no more elements to return, the iterator raises a `StopIteration` exception to signal that it has reached the end of the sequence.

The built-in `iter()` function, which calls the object's `__iter__` method, returns an iterator for an iterable.

In [31]:
print(my_list)

[1, 2, 3, 4]


In [41]:
# Let's get an iterator from the list.
my_iter = iter(my_list)
print(my_iter) # iter() returns a new list_iterator; my_list itself is unchanged.

<list_iterator object at 0x00000248F4EF41F0>


In [48]:
type(my_iter)

list_iterator

We can call the built-in `next()` function, which calls `__next__`, to access elements one at a time. When there are no more elements to return, the iterator raises a StopIteration exception.

In [42]:
next(my_iter)

1

In [43]:
next(my_iter)

2

In [44]:
next(my_iter)

3

In [45]:
next(my_iter)

4

In [47]:
next(my_iter) # Gives StopIteration exception when iteration reaches the end

StopIteration: 

In [50]:
dir(my_iter)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__length_hint__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

![image.png](attachment:image.png)


An iterator has `__next__` in addition to `__iter__`.


**Iterator Vs Iterable** 

In Python, an iterable is an object that can produce an iterator, that is, an object capable of returning its elements one at a time. An iterable can be any object that implements the `__iter__` method or the `__getitem__` method. An example of an iterable is a `list`.

An iterator, on the other hand, is the object that actually hands out the elements one at a time. It is created from an iterable by passing the iterable to the `iter()` function. An iterator implements the iterator protocol, which consists of the methods `__iter__()` and `__next__()`. The `__iter__` method returns the iterator object itself, and the `__next__` method returns the next value from the iterator. When there are no more values to return, the `__next__` method raises a StopIteration exception.
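A practical consequence of this distinction is that an iterator is used up after one pass, while an iterable can hand out a fresh iterator each time. The sketch below illustrates this; the class `Squares` is hypothetical and shows an object that is iterable only through `__getitem__`.

```python
numbers = [10, 20, 30]          # an iterable
it = iter(numbers)              # an iterator over it

same_object = iter(it) is it                           # an iterator returns itself
fresh_each_time = iter(numbers) is not iter(numbers)   # an iterable returns new iterators

first_pass = list(it)           # consumes the iterator
second_pass = list(it)          # nothing left
list_again = list(numbers)      # the list itself can be traversed again


class Squares:
    """Iterable through __getitem__ only (illustrative class)."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError     # tells iter() where the sequence ends
        return index ** 2


print(same_object, fresh_each_time)   # True True
print(first_pass, second_pass)        # [10, 20, 30] []
print(list(Squares()))                # [0, 1, 4]
```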

In [51]:
fruits = ["apple", "banana", "mango", "orange"]
type(fruits)

list

In [58]:
# We will use the fruits iterable to create an iterator.
fruit_iter = iter(fruits)
print(type(fruit_iter))


<class 'list_iterator'>


In [60]:
next(fruit_iter) # Each call returns the next element. The cell numbers suggest an earlier call (not shown) already returned 'apple'.

'banana'

In [63]:
# When all of the items have been consumed, next() raises StopIteration.
next(fruit_iter)

StopIteration: 

## Generator:

A generator is a special type of iterator in Python. It is a way to create an iterator that generates values on-the-fly as you iterate over it, instead of having to generate all the values beforehand and store them in memory. Generators are defined using a special type of function called a generator function.

A generator function is defined just like a normal function, but instead of using the `return` statement to return a value, a generator function uses the `yield` statement. When a generator function is called, it returns a generator object, which you can then iterate over using a for loop or the `next()` function. Each time the `next()` function is called on the generator object, the generator function is executed up to the next `yield` statement, and the value of the yield expression is returned. When the function body finishes without reaching another `yield`, a StopIteration exception is raised.

Here's an example of a generator function that generates the Fibonacci sequence:
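A minimal sketch of such a generator, assuming the sequence starts at 0, 1. The function name `fibonacci` is chosen here for illustration, and `itertools.islice` is used only to take a finite slice of the endless sequence.

```python
from itertools import islice


def fibonacci():
    """Yield Fibonacci numbers forever, starting 0, 1, 1, 2, ..."""
    a, b = 0, 1
    while True:
        yield a              # hand out the current value and pause here
        a, b = b, a + b      # resume from this line on the next call


first_ten = list(islice(fibonacci(), 10))
print(first_ten)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```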


 The advantage of using a generator is that you can generate an effectively unlimited sequence of values without having to store all of them in memory at once.
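The memory difference can be checked roughly with `sys.getsizeof`. This is only a sketch: exact byte counts vary by Python version and platform, and `getsizeof` on a list measures the list object itself, not the integers it refers to. The function name `squares` is hypothetical.

```python
import sys


def squares(n):
    """Generate the squares of 0..n-1 one at a time."""
    for i in range(n):
        yield i ** 2


n = 100_000
squares_list = [i ** 2 for i in range(n)]   # all values exist in memory at once
squares_gen = squares(n)                    # values are produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)                 # True

# The generator can still feed any consumer that takes values one by one.
total = sum(squares_gen)
print(total == sum(squares_list))           # True
```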






When the generator function is called, it does not execute the function body immediately. Instead, it returns a generator object that can be iterated over to produce the values.
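This lazy start can be observed by recording when the function body actually runs. A minimal sketch, with the hypothetical names `numbers` and `events`:

```python
events = []   # records which parts of the generator body have run


def numbers():
    events.append("body started")
    yield 1
    events.append("resumed after first yield")
    yield 2


gen = numbers()
events_after_call = list(events)     # calling the function runs none of the body

first = next(gen)                    # runs up to the first yield
events_after_first = list(events)

second = next(gen)                   # resumes and runs up to the second yield

try:
    next(gen)                        # the body ends, so StopIteration is raised
    finished = False
except StopIteration:
    finished = True

print(events_after_call)    # []
print(events_after_first)   # ['body started']
print(first, second, finished)   # 1 2 True
```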

In [64]:
def square(n):
    for i in range(n):
        return i ** 2


In [74]:
square(5) # return exits the function on the first loop pass, so we get only the first element, not the whole range.

0

The `yield` statement is used in Python to define a generator function. A generator function is a special type of function that generates values on-the-fly as you iterate over it, instead of having to generate all the values beforehand and store them in memory.

In [75]:
def square(n):
    for i in range(n):
        yield i ** 2

In [92]:
k = square(20)
type(k) # we have created a Generator object.

generator

In [91]:
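next(k) # Note the cell numbers: this ran before In [92] re-created k, so 25 came from an earlier, partly consumed generator. On a fresh generator the first next(k) returns 0.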

25

In [93]:
for i in k:
    print(i) 

0
1
4
9
16
25
36
49
64
81
100
121
144
169
196
225
256
289
324
361


In [6]:
# example: stepping through an iterator by hand, as a for loop does internally
list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
iter1 = iter(list1)

In [7]:
while True:
    try:
        item = next(iter1)
        print(item)
    except StopIteration: # raised by next() when the iterator is exhausted
        break

1
2
3
4
5
6
7
8
9
10



source: ChatGPT 


In Python, iterable, iterator, and generator are related concepts that are used to create and manage sequences of values. Here's a brief overview of the differences between these concepts:

* Iterable: An iterable is any object in Python that can be used in a for loop or passed to the built-in iter function to get an iterator. In other words, an iterable is any object that implements the `__iter__` method or the `__getitem__` method. Examples of iterables include lists, tuples, dictionaries, and strings.

* Iterator: An iterator is an object that implements the `__iter__` method and the `__next__` method. The `__iter__` method returns the iterator object itself, and the `__next__` method returns the next value in the sequence each time it is called. When there are no more values to return, the `__next__` method raises the StopIteration exception. You can get an iterator from an iterable by passing it to the built-in iter function or by using a for loop.

* Generator: A generator is a special type of iterator that is defined using a generator function. A generator function is a function that uses the `yield` statement to generate values, instead of using the `return` statement. When a generator function is called, it returns a generator object, **which you can then iterate over**. Each time the `next()` function is called on the generator object, the generator function is executed up to the next yield statement, and the value of the yield expression is returned. When there are no more yield statements left in the generator function, a StopIteration exception is raised. Generators are a convenient way to generate sequences of values because they **allow you to generate values on-the-fly, as you iterate over them, instead of having to generate all the values beforehand and store them in memory.**

So, in summary, an `iterable` is any object that can be used to generate an `iterator`, and a `generator` is a special type of iterator that is defined using a generator function and generates values on-the-fly.
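The three categories can be told apart at runtime with the abstract base classes in `collections.abc`. The sketch below is illustrative; the generator function `count_up` is a hypothetical name.

```python
from collections.abc import Generator, Iterable, Iterator


def count_up(n):
    """Generator function yielding 0..n-1."""
    for i in range(n):
        yield i


samples = {
    "list": [1, 2],
    "list iterator": iter([1, 2]),
    "generator": count_up(2),
}

# For each object: (is it iterable?, is it an iterator?, is it a generator?)
classification = {
    name: (
        isinstance(obj, Iterable),
        isinstance(obj, Iterator),
        isinstance(obj, Generator),
    )
    for name, obj in samples.items()
}

for name, flags in classification.items():
    print(name, flags)
# list (True, False, False)
# list iterator (True, True, False)
# generator (True, True, True)
```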