# Tuples

In [None]:
tup = (4, 5, 6)
tup

(4, 5, 6)

In many contexts, the parentheses can be omitted, so here we could also have written:

In [None]:
tup = 4, 5, 6
tup

(4, 5, 6)

You can convert any sequence or iterator to a tuple by invoking tuple:

In [None]:
tuple([4, 0, 2])
tup = tuple('string')
tup

('s', 't', 'r', 'i', 'n', 'g')

Elements can be accessed with square brackets [], as with most other sequence types. Sequences are 0-indexed in Python:

In [None]:
tup[0]

When you’re defining tuples within more complicated expressions, it’s often necessary to enclose the values in parentheses, as in this example of creating a tuple of tuples:

In [None]:
nested_tup = (4, 5, 6), (7, 8)
nested_tup
nested_tup[0]
nested_tup[1]

(7, 8)

While the objects stored in a tuple may be mutable themselves, once the tuple is
created it’s not possible to modify which object is stored in each slot:

In [None]:
tup = tuple(['foo', [1, 2], True])
tup[2] = False

TypeError: 'tuple' object does not support item assignment

If an object inside a tuple is mutable, such as a list, you can modify it in place:

In [None]:
tup[1].append(3)
tup

('foo', [1, 2, 3], True)

You can concatenate tuples using the + operator to produce longer tuples:

In [None]:
(4, None, 'foo') + (6, 0) + ('bar',)

(4, None, 'foo', 6, 0, 'bar')

Multiplying a tuple by an integer, as with lists, has the effect of concatenating that
many copies of the tuple:

In [None]:
('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')

### Unpacking tuples

If you try to assign to a tuple-like expression of variables, Python will attempt to unpack the value on the right-hand side of the equals sign:

In [None]:
tup = (4, 5, 6)
a, b, c = tup
b

Even sequences with nested tuples can be unpacked:

In [None]:
tup = 4, 5, (6, 7)
a, b, (c, d) = tup
d

In [None]:
a, b = 1, 2
a
b
b, a = a, b
a
b

1

A common use of variable unpacking is iterating over sequences of tuples or lists:

In [None]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print(f'a={a}, b={b}, c={c}')

a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9


There are some situations where you may want to “pluck” a few elements from the beginning of a tuple. There is a special syntax that can do this, *rest, which is also used in function signatures to capture an arbitrarily long list of positional arguments:

In [None]:
values = 1, 2, 3, 4, 5
a, b, *rest = values
print(a)
print(b)
print(rest)

1
2
[3, 4, 5]
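The same star syntax appears in function signatures. The following is a small sketch (the function name `describe` is made up for illustration). Note that inside a function the extra arguments arrive as a tuple, whereas an unpacking assignment binds the starred name to a list:

```python
def describe(first, second, *rest):
    # Extra positional arguments are collected into the tuple `rest`
    return first, second, rest

result = describe(1, 2, 3, 4, 5)
print(result)  # (1, 2, (3, 4, 5))

# In an unpacking assignment the starred name is bound to a list instead
a, b, *others = (1, 2, 3, 4, 5)
print(others)  # [3, 4, 5]
```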


This rest bit is sometimes something you want to discard; there is nothing special about the rest name. As a matter of convention, many Python programmers will use the underscore (_) for unwanted variables:

In [None]:
a, b, *_ = values

### Tuple methods

Since the size and contents of a tuple cannot be modified, it is very light on instance methods. A particularly useful one (also available on lists) is count, which counts the number of occurrences of a value:

In [None]:
a = (1, 2, 2, 2, 3, 4, 2)
a.count(2)
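For reference, here is a short sketch of count next to index, the other public method that tuples provide. The values in the comments follow from the tuple defined above:

```python
a = (1, 2, 2, 2, 3, 4, 2)

print(a.count(2))  # 4: the value 2 appears four times
print(a.index(3))  # 4: position of the first 3
print(a.index(2))  # 1: position of the first 2
```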

# List

In [None]:
a_list = [2, 3, 7, None]

tup = ("foo", "bar", "baz")
b_list = list(tup)
print(b_list)
b_list[1] = "peekaboo"
print(b_list)

['foo', 'bar', 'baz']
['foo', 'peekaboo', 'baz']


Lists and tuples are semantically similar (though tuples cannot be modified) and can
be used interchangeably in many functions.
The list built-in function is frequently used in data processing as a way to materialize an iterator or generator expression:

In [None]:
gen = range(10)
gen
list(gen)

## Adding and removing elements

Elements can be appended to the end of the list with the append method:

In [None]:
b_list.append("dwarf")
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

Using insert you can insert an element at a specific location in the list:

In [None]:
b_list.insert(1, "red")
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']

The insertion index should be between 0 and the length of the list, inclusive; an index past the end does not raise an error, and the element is simply appended.

The inverse operation to insert is pop, which removes and returns an element at a particular index:

In [None]:
b_list.pop(2)
b_list

['foo', 'red', 'baz', 'dwarf']

Elements can be removed by value with remove, which locates the first such value and removes it from the list:

In [None]:
b_list.append("foo")
print(b_list)
b_list.remove("foo")
print(b_list)

['foo', 'red', 'baz', 'dwarf', 'foo']
['red', 'baz', 'dwarf', 'foo']


If performance is not a concern, by using append and remove, you can use a Python list as a set-like data structure (although Python has actual set objects, discussed later).

Check if a list contains a value using the in keyword:

In [None]:
"dwarf" in b_list

True

In [None]:
"dwarf" not in b_list

False

Checking whether a list contains a value is a lot slower than doing so with dictionaries and sets (to be introduced shortly), as Python makes a linear scan across the values of the list, whereas it can check the others (based on hash tables) in constant time.
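A rough way to see the difference is to time the same membership test against a list and a set. This is only a sketch: the container size and repeat count are arbitrary choices, and the timings depend on the machine, so no expected numbers are shown:

```python
import timeit

values = list(range(100_000))
as_list = values
as_set = set(values)
target = 99_999  # worst case for the list: the last element

# Time 100 membership tests against each container
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

On most machines the set lookup should come out far faster, because it does not need to scan the elements one by one.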

## Concatenating and combining lists

Similar to tuples, adding two lists together with + concatenates them:

In [None]:
[4, None, "foo"] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

If you have a list already defined, you can append multiple elements to it using the
extend method:

In [None]:
x = [4, None, "foo"]
x.extend([7, 8, (2, 3)])
x

Note that list concatenation by addition is a comparatively expensive operation since a new list must be created and the objects copied over. Using extend to append elements to an existing list, especially if you are building up a large list, is usually preferable.

So:
```
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)
```
is faster than the concatenative alternative:
```
everything = []
for chunk in list_of_lists:
    everything = everything + chunk
```

## Sorting

You can sort a list in place (without creating a new object) by calling its sort method:

In [None]:
a = [7, 2, 5, 1, 3]
a.sort()
a

sort has a few options that will occasionally come in handy. One is the ability to pass a secondary sort key—that is, a function that produces a value to use to sort the objects. For example, we could sort a collection of strings by their lengths:

In [34]:
b = ["saw", "small", "He", "foxes", "six"]
b.sort(key=len)
print(b)

b2 = ["saw", "small", "He", "foxes", "six"]
b2.sort()
print(b2)

['He', 'saw', 'six', 'small', 'foxes']
['He', 'foxes', 'saw', 'six', 'small']


Soon, we’ll look at the sorted function, which can produce a sorted copy of a general sequence.

## Slicing

You can select sections of most sequence types by using slice notation, which in its basic form consists of start:stop passed to the indexing operator []:

In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[1:5]

Slices can also be assigned with a sequence:

In [None]:
seq[3:5] = [6, 3]
seq

While the element at the start index is included, the stop index is not included, so
that the number of elements in the result is stop - start.
Either the start or stop can be omitted, in which case they default to the start of the
sequence and the end of the sequence, respectively:

In [None]:
seq[:5]
seq[3:]
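The stop - start rule can be checked directly. This is a small sketch on a fresh copy of the list, assuming both indices fall within the bounds of the sequence:

```python
seq = [7, 2, 3, 7, 5, 6, 0, 1]
start, stop = 1, 5

piece = seq[start:stop]
print(piece)                       # [2, 3, 7, 5]
print(len(piece) == stop - start)  # True

# Omitted bounds default to the two ends, so these halves rebuild the list
print(seq[:3] + seq[3:] == seq)    # True
```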

Negative indices slice the sequence relative to the end:

In [35]:
seq[-4:]
seq[-6:-2]

[3, 6, 3, 6]

A step can also be used after a second colon to, say, take every other element:

In [None]:
seq[::2]

A clever use of this is to pass -1, which has the useful effect of reversing a list or tuple:

In [None]:
seq[::-1]

# Dictionary

In [37]:
empty_dict = {}
d1 = {"a": "some value", "b": [1, 2, 3, 4]}
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

You can access, insert, or set elements using the same syntax as for accessing elements of a list or tuple:

In [38]:
d1[7] = "an integer"
d1
d1["b"]

[1, 2, 3, 4]

You can check if a dictionary contains a key using the same syntax used for checking whether a list or tuple contains a value:

In [None]:
"b" in d1

You can delete values using either the del keyword or the pop method (which simultaneously returns the value and deletes the key):

In [39]:
d1[5] = "some value"
print(d1)
d1["dummy"] = "another value"
print(d1)
del d1[5]
print(d1)
ret = d1.pop("dummy")
print(ret)
print(d1)

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 5: 'some value'}
{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 5: 'some value', 'dummy': 'another value'}
{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 'dummy': 'another value'}
another value
{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}
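One detail worth keeping in mind: both del and pop raise a KeyError when the key is missing. As a sketch (the dictionary `inventory` is an illustrative example), pop accepts an optional default to return instead, and get reads a key without failing:

```python
inventory = {"apples": 3, "pears": 0}

# pop with a second argument returns that default when the key is absent
print(inventory.pop("kiwis", None))  # None
print(inventory.pop("pears"))        # 0
print(inventory)                     # {'apples': 3}

# get reads a value without raising for a missing key
print(inventory.get("kiwis", 0))     # 0

try:
    del inventory["kiwis"]
except KeyError as exc:
    print("KeyError:", exc)          # KeyError: 'kiwis'
```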


The keys and values methods give you iterators of the dictionary’s keys and values, respectively. The order of the keys depends on the order of their insertion, and these methods output the keys and values in the same respective order:

In [None]:
list(d1.keys())
list(d1.values())

If you need to iterate over both the keys and values, you can use the items method to iterate over the keys and values as 2-tuples:

In [None]:
list(d1.items())


You can merge one dictionary into another using the update method:

In [None]:
d1.update({"b": "foo", "c": 12})
d1

The update method changes dictionaries in place, so any existing keys in the data passed to update will have their old values discarded.
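To make the in-place behavior concrete, here is a sketch that rebuilds d1 as it stood before the update and keeps a second reference to it (the name `same_object` is only for illustration):

```python
d1 = {"a": "some value", "b": [1, 2, 3, 4], 7: "an integer"}
same_object = d1

d1.update({"b": "foo", "c": 12})

# The old value for "b" is discarded and "c" is added at the end
print(d1)  # {'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

# update modified the existing dictionary rather than creating a new one
print(same_object is d1)  # True
```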

### Creating dictionaries from sequences

In [40]:
tuples = zip(range(5), reversed(range(5)))
print(tuples)
mapping = dict(tuples)
print(mapping)


<zip object at 0x79da80217440>
{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}


In [41]:
words = ["apple", "bat", "bar", "atom", "book"]
by_letter = {}

for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)

by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

The setdefault dictionary method can be used to simplify this workflow. The preceding for loop can be rewritten as:

In [None]:
by_letter = {}
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)
by_letter

The built-in collections module has a useful class, defaultdict, which makes this even easier. To create one, you pass a type or function for generating the default value for each slot in the dictionary:

In [None]:
from collections import defaultdict
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
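As a sketch of what the default factory does, the example below repeats the grouping and then uses int as the factory, so that missing keys start at 0. The counting variant is an added illustration, not part of the original example:

```python
from collections import defaultdict

words = ["apple", "bat", "bar", "atom", "book"]

by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))  # {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

# With int as the factory, missing keys start at 0, which suits counting
letter_counts = defaultdict(int)
for word in words:
    letter_counts[word[0]] += 1
print(dict(letter_counts))  # {'a': 2, 'b': 3}
```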

### Valid dictionary key types

While the values of a dictionary can be any Python object, the keys generally have to be immutable objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too). The technical term here is hashability. You can check whether an object is hashable (can be used as a key in a dictionary) with the hash function:

In [2]:
print(hash("string"))
print(hash((1, 2, (2, 3))))
print(hash((1, 2, [2, 3]))) # fails because lists are mutable

-4220075762722178930


TypeError: unhashable type: 'list'

The hash values you see when using the hash function in general will depend on the Python version you are using.
To use a list as a key, one option is to convert it to a tuple, which can be hashed as long as its elements also can be:

In [None]:
d = {}
d[tuple([1, 2, 3])] = 5
d
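If you want to test hashability without letting the exception escape, one option is a small wrapper around hash. The helper name `is_hashable` is hypothetical and not part of the standard library:

```python
def is_hashable(obj):
    """Return True if obj can be used as a dictionary key (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2, (2, 3))))    # True
print(is_hashable((1, 2, [2, 3])))    # False: the tuple contains a list
print(is_hashable([1, 2, 3]))         # False
print(is_hashable(tuple([1, 2, 3])))  # True
```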

# Set

In [None]:
set([2, 2, 2, 1, 3, 3])
{2, 2, 2, 1, 3, 3}

Sets support mathematical set operations like union, intersection, difference, and symmetric difference. Consider these two example sets:

In [4]:
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

In [5]:
print(a.union(b))
print(a | b)

{1, 2, 3, 4, 5, 6, 7, 8}
{1, 2, 3, 4, 5, 6, 7, 8}


The intersection contains the elements occurring in both sets. The & operator or the intersection method can be used:

In [6]:
print(a.intersection(b))
print(a & b)

{3, 4, 5}
{3, 4, 5}
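The difference and symmetric difference mentioned earlier follow the same pattern of a method plus an operator. In this sketch the results are passed through sorted so the printed order does not depend on how the set happens to store its elements:

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

# Elements in a that are not in b
print(sorted(a.difference(b)))            # [1, 2]
print(sorted(a - b))                      # [1, 2]

# Elements in exactly one of the two sets
print(sorted(a.symmetric_difference(b)))  # [1, 2, 6, 7, 8]
print(sorted(a ^ b))                      # [1, 2, 6, 7, 8]
```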


All of the logical set operations have in-place counterparts, which enable you to replace the contents of the set on the left side of the operation with the result. For very large sets, this may be more efficient:

In [None]:
c = a.copy()
c |= b
c
d = a.copy()
d &= b
d

Like dictionary keys, set elements generally must be immutable, and they must be hashable (which means that calling hash on a value does not raise an exception). In order to store list-like elements (or other mutable sequences) in a set, you can convert them to tuples:

In [None]:
my_data = [1, 2, 3, 4]
my_set = {tuple(my_data)}
my_set

You can also check if a set is a subset of (is contained in) or a superset of (contains all elements of) another set:

In [None]:
a_set = {1, 2, 3, 4, 5}
{1, 2, 3}.issubset(a_set)
a_set.issuperset({1, 2, 3})

Sets are equal if and only if their contents are equal:

In [8]:
{1, 2, 3} == {3, 2, 1}

True

# Built-In Sequence Functions
### enumerate
It’s common when iterating over a sequence to want to keep track of the index of the current item. A do-it-yourself approach would look like:
```
index = 0
for value in collection:
    # do something with value
    index += 1
```
Since this is so common, Python has a built-in function, enumerate, which returns a sequence of (i, value) tuples:
```
for index, value in enumerate(collection):
    # do something with value
    pass
```
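Here is a concrete, runnable version of the same loop. The contents of `collection` are illustrative; the optional start argument of enumerate changes the first index:

```python
collection = ["foo", "bar", "baz"]  # illustrative data

for index, value in enumerate(collection):
    print(index, value)
# 0 foo
# 1 bar
# 2 baz

# The optional start argument changes the first index
numbered = list(enumerate(collection, start=1))
print(numbered)  # [(1, 'foo'), (2, 'bar'), (3, 'baz')]
```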

### sorted

The sorted function returns a new sorted list from the elements of any sequence:

In [7]:
sorted([7, 1, 2, 6, 0, 3, 2])
sorted("horse race")

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

### zip

zip “pairs” up the elements of a number of lists, tuples, or other sequences into tuples. It returns a lazy iterator, so wrap it in list to see the pairs:

In [9]:
seq1 = ["foo", "bar", "baz"]
seq2 = ["one", "two", "three"]
zipped = zip(seq1, seq2)
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

zip can take an arbitrary number of sequences, and the number of elements it produces is determined by the shortest sequence:

In [10]:
seq3 = [False, True]
list(zip(seq1, seq2, seq3))

[('foo', 'one', False), ('bar', 'two', True)]


A common use of zip is simultaneously iterating over multiple sequences, possibly also combined with enumerate:

In [11]:
for index, (a, b) in enumerate(zip(seq1, seq2)):
    print(f"{index}: {a}, {b}")


0: foo, one
1: bar, two
2: baz, three
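zip can also be run in the other direction to split a sequence of pairs back into separate sequences. This sketch uses the star syntax to pass each pair as its own argument; the data is illustrative:

```python
pairs = [("foo", "one"), ("bar", "two"), ("baz", "three")]

# zip(*pairs) is the same as zip(pairs[0], pairs[1], pairs[2])
firsts, seconds = zip(*pairs)
print(firsts)   # ('foo', 'bar', 'baz')
print(seconds)  # ('one', 'two', 'three')
```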


### reversed

reversed iterates over the elements of a sequence in reverse order:

In [None]:
list(reversed(range(10)))

# List, Set, and Dictionary Comprehensions
List comprehensions are a convenient and widely used Python language feature. They allow you to concisely form a new list by filtering the elements of a collection, transforming the elements passing the filter into one concise expression. They take the basic form:

```
[expr for value in collection if condition]
```
This is equivalent to the following for loop:
```
result = []
for value in collection:
    if condition:
        result.append(expr)
```

Task: keep only the strings longer than two characters, convert them to uppercase, and collect them in a new list.

In [12]:
strings = ["a", "as", "bat", "car", "dove", "python"]
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

Set comprehension example:

In [13]:
unique_lengths = {len(x) for x in strings}
unique_lengths

{1, 2, 3, 4, 6}

We could also express this more functionally using the map function, introduced shortly:

In [None]:
set(map(len, strings))

As a simple dictionary comprehension example, we could create a lookup map of these strings for their locations in the list:

In [14]:
loc_mapping = {value: index for index, value in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}


### Nested list comprehensions
Suppose we have a list of lists containing some English and Spanish names:

In [16]:
all_data = [["John", "Emily", "Michael", "Mary", "Steven"],
            ["Maria", "Juan", "Javier", "Natalia", "Pilar"]]

Suppose we wanted to get a single list containing all names with two or more a’s in them. We could certainly do this with a simple for loop:

In [17]:
names_of_interest = []
for names in all_data:
    enough_as = [name for name in names if name.count("a") >= 2]
    names_of_interest.extend(enough_as)
names_of_interest

['Maria', 'Natalia']

You can actually wrap this whole operation up in a single nested list comprehension, which will look like:

In [18]:
result = [name for names in all_data for name in names
          if name.count("a") >= 2]
result

['Maria', 'Natalia']

At first, nested list comprehensions are a bit hard to wrap your head around. The for parts of the list comprehension are arranged according to the order of nesting, and any filter condition is put at the end as before. Here is another example where we “flatten” a list of tuples of integers into a simple list of integers:

In [None]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

Keep in mind that the order of the for expressions would be the same if you wrote a nested for loop instead of a list comprehension:

In [None]:
flattened = []

for tup in some_tuples:
    for x in tup:
        flattened.append(x)

You can have arbitrarily many levels of nesting, though if you have more than two or three levels of nesting, you should probably start to question whether this makes sense from a code readability standpoint. It’s important to distinguish the syntax just shown from a list comprehension inside a list comprehension, which is also perfectly valid:

In [None]:
[[x for x in tup] for tup in some_tuples]
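Putting the two forms side by side makes the distinction easier to see. This sketch reuses the same data:

```python
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Two for clauses in one comprehension: a single flat list
flattened = [x for tup in some_tuples for x in tup]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# A comprehension inside a comprehension: a list of lists
nested = [[x for x in tup] for tup in some_tuples]
print(nested)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```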

# Function

In [19]:
def my_function(x, y):
    return x + y

In [20]:
my_function(1, 2)
result = my_function(1, 2)
result

3

There is no issue with having multiple return statements. If Python reaches the end of a function without encountering a return statement, None is returned automatically. For example:

In [21]:
def function_without_return(x):
    print(x)

result = function_without_return("hello!")
print(result)

hello!
None


Each function can have positional arguments and keyword arguments. Keyword arguments are most commonly used to specify default values or optional arguments. Here we will define a function with an optional z argument with the default value 1.5:

In [22]:
def my_function2(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

While keyword arguments are optional, all positional arguments must be specified
when calling a function.
You can pass values to the z argument with or without the keyword provided, though using the keyword is encouraged:

In [None]:
my_function2(5, 6, z=0.7)
my_function2(3.14, 7, 3.5)
my_function2(10, 20)

The main restriction on function arguments is that the keyword arguments must follow the positional arguments (if any). You can specify keyword arguments in any order. This frees you from having to remember the order in which the function arguments were specified. You need to remember only what their names are.
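A short sketch of these rules, with my_function2 repeated so the block runs on its own:

```python
def my_function2(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

# Keyword arguments may appear in any order
print(my_function2(x=5, y=6, z=7))  # 77
print(my_function2(y=6, x=5, z=7))  # 77

# Positional arguments first, then keywords
print(my_function2(5, 6, z=7))      # 77

# Writing my_function2(z=7, 5, 6) would be a SyntaxError:
# a positional argument cannot follow a keyword argument
```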

In [23]:
def func():
    a = []
    for i in range(5):
        a.append(i)


When func() is called, the empty list a is created, five elements are appended, and then a is destroyed when the function exits. Suppose instead we had declared a as follows:

In [24]:
a = []
def func():
    for i in range(5):
        a.append(i)

Each call to func will modify list a:

In [25]:
func()
print(a)
func()
print(a)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]


Assigning variables outside of the function’s scope is possible, but those variables
must be declared explicitly using either the global or nonlocal keywords:

In [26]:
a = None
def bind_a_variable():
    global a
    a = []
bind_a_variable()
print(a)

[]


In [27]:
a = None
def bind_a_variable():
    # global a
    a = []
bind_a_variable()
print(a)

None
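The examples above cover global. For nonlocal, which rebinds a variable in an enclosing function rather than at module level, a minimal sketch looks like this (the names `make_counter` and `increment` are illustrative):

```python
def make_counter():
    count = 0  # lives in make_counter's scope

    def increment():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```

Without the nonlocal line, `count += 1` would raise an UnboundLocalError, because the assignment would make count a local variable of increment.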



### Returning Multiple Values
When I first programmed in Python after having programmed in Java and C++, one of my favorite features was the ability to return multiple values from a function with simple syntax. Here’s an example:
```
def f():
    a = 5
    b = 6
    c = 7
    return a, b, c

a, b, c = f()
```
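What happens here is that the function returns a single object, a tuple, which is then unpacked into the result variables. A sketch that makes the intermediate tuple visible (the name `return_value` is illustrative):

```python
def f():
    a = 5
    b = 6
    c = 7
    return a, b, c

return_value = f()
print(return_value)        # (5, 6, 7)
print(type(return_value))  # <class 'tuple'>

a, b, c = return_value     # unpacking the tuple
print(a, b, c)             # 5 6 7
```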


# Functions Are Objects

In [28]:
states = ["   Alabama ", "Georgia!", "Georgia", "georgia", "FlOrIda",
          "south   carolina##", "West virginia?"]

In [29]:
import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()  # strip() removes leading and trailing whitespace
        value = re.sub("[!#?]", "", value)  # remove the punctuation characters !, #, and ?
        value = value.title()  # title() capitalizes the first letter of every word and lowercases the rest
        result.append(value)
    return result

In [30]:
clean_strings(states)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South   Carolina',
 'West Virginia']

### An alternative: passing the cleaning functions around as objects

In [31]:
def remove_punctuation(value):
    return re.sub("[!#?]", "", value)

clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for func in ops:
            value = func(value)
        result.append(value)
    return result

In [32]:
clean_strings(states, clean_ops)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South   Carolina',
 'West Virginia']


You can use functions as arguments to other functions like the built-in map function, which applies a function to a sequence of some kind:

In [None]:
for x in map(remove_punctuation, states):
    print(x)

### Anonymous (Lambda) Functions

In [None]:
def short_function(x):
    return x * 2

equiv_anon = lambda x: x * 2

Lambda functions are especially convenient in data analysis because, as you’ll see, there are many cases where data transformation functions will take functions as arguments. It’s often less typing (and clearer) to pass a lambda function as opposed to writing a full-out function declaration or even assigning the lambda function to a local variable. Consider this example:

In [None]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]

ints = [4, 0, 1, 5, 6]
apply_to_list(ints, lambda x: x * 2)

In [None]:
strings = ["foo", "card", "bar", "aaaa", "abab"]

Suppose we want to sort these strings by the number of distinct letters in each one. Here we could pass a lambda function to the list’s sort method:

In [None]:
strings.sort(key=lambda x: len(set(x)))
strings

### Generators

Many objects in Python support iteration, such as over objects in a list or lines in a file. This is accomplished by means of the iterator protocol, a generic way to make objects iterable. For example, iterating over a dictionary yields the dictionary keys:

In [33]:
some_dict = {"a": 1, "b": 2, "c": 3}
for key in some_dict:
    print(key)

a
b
c


When you write for key in some_dict, the Python interpreter first attempts to create an iterator out of some_dict:

In [34]:
dict_iterator = iter(some_dict)
dict_iterator

<dict_keyiterator at 0x7a49439eef20>

An iterator is any object that will yield objects to the Python interpreter when used in a context like a for loop. Most methods expecting a list or list-like object will also accept any iterable object. This includes built-in methods such as min, max, and sum, and type constructors like list and tuple:

In [35]:
list(dict_iterator)

['a', 'b', 'c']
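Under the hood, a for loop repeatedly calls next on the iterator until it is exhausted. The following sketch steps through the same dictionary by hand:

```python
some_dict = {"a": 1, "b": 2, "c": 3}
dict_iterator = iter(some_dict)

print(next(dict_iterator))  # a
print(next(dict_iterator))  # b
print(next(dict_iterator))  # c

# Once exhausted, the iterator signals the end with StopIteration
try:
    next(dict_iterator)
except StopIteration:
    print("iterator exhausted")

# An exhausted iterator yields nothing more
print(list(dict_iterator))  # []
```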

A generator is a convenient way, similar to writing a normal function, to construct a new iterable object. Whereas normal functions execute and return a single result at a time, generators can return a sequence of multiple values by pausing and resuming execution each time the generator is used. To create a generator, use the yield keyword instead of return in a function:

In [None]:
def squares(n=10):
    print(f"Generating squares from 1 to {n ** 2}")
    for i in range(1, n + 1):
        yield i ** 2

When you actually call the generator, no code is immediately executed:

In [None]:
gen = squares()
gen

It is not until you request elements from the generator that it begins executing its
code:

In [None]:
for x in gen:
    print(x, end=" ")
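The pausing and resuming can be observed by pulling values one at a time with next. This sketch repeats the squares generator with a smaller n so it runs on its own:

```python
def squares(n=10):
    print(f"Generating squares from 1 to {n ** 2}")
    for i in range(1, n + 1):
        yield i ** 2

gen = squares(3)      # nothing is printed yet
first = next(gen)     # prints the message, then pauses at the first yield
second = next(gen)    # resumes right after the yield
print(first, second)  # 1 4
print(list(gen))      # [9]: only the values not yet consumed remain
```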


### Generator expressions

Another way to make a generator is by using a generator expression. This is a generator analogue to list, dictionary, and set comprehensions. To create one, enclose what would otherwise be a list comprehension within parentheses instead of brackets:

In [None]:
gen = (x ** 2 for x in range(100))
gen

In [None]:
sum(x ** 2 for x in range(100))
dict((i, i ** 2) for i in range(5))

In [None]:
import itertools
def first_letter(x):
    return x[0]

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]

for letter, names in itertools.groupby(names, first_letter):
    print(letter, list(names)) # names is a generator

In [None]:
float("1.2345")
float("something")

In [None]:
def attempt_float(x):
    try:
        return float(x)
    except:
        return x

In [None]:
attempt_float("1.2345")
attempt_float("something")

In [None]:
float((1, 2))

In [None]:
def attempt_float(x):
    try:
        return float(x)
    except ValueError:
        return x

In [None]:
attempt_float((1, 2))

In [None]:
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

# Files and the Operating System

### This part is only needed if your file is in Google Drive
Preparation steps:
1.   Create a folder called 'analytics_programming' in your Google Drive
2.   Put all the files you need for this class in it





In [36]:
# How Colab opens files:
# https://saturncloud.io/blog/how-to-use-google-colab-to-work-with-local-files/
# https://stackoverflow.com/questions/48376580/how-to-read-data-in-google-colab-from-my-google-drive

from google.colab import drive
drive.mount('/content/drive', force_remount=True)
google_drive_path_header = '/content/drive/MyDrive/analytics_programming'

Mounted at /content/drive


In [46]:
path = google_drive_path_header + '/headers.txt'
print(path)
f = open(path, encoding="utf-8")

/content/drive/MyDrive/analytics_programming/headers.txt


Here, I pass encoding="utf-8" as a best practice because the default Unicode encoding for reading files varies from platform to platform.
By default, the file is opened in read-only mode "r". We can then treat the file object f like a list and iterate over the lines like so:

In [47]:
lines = []
for line in f:
    lines.append(line)

print(lines)

['datetime,Date,time,ticker,mid,sector,industrygroup']


In [None]:
lines = [x.rstrip() for x in open(path, encoding="utf-8")]
lines

When you use open to create file objects, it is recommended to close the file when you are finished with it. Closing the file releases its resources back to the operating system:

In [49]:
f.close()

In [None]:
with open(path, encoding="utf-8") as f:
    lines = [x.rstrip() for x in f]
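The with statement closes the file automatically when the block exits, even if an error occurs inside it. This sketch checks that behavior; it writes a throwaway file with tempfile so it does not depend on the Google Drive path, and the name `demo_path` and the file contents are made up for illustration:

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on Google Drive
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("first line\nsecond line\n")
    demo_path = tmp.name

with open(demo_path, encoding="utf-8") as f:
    lines = [x.rstrip() for x in f]
    print(f.closed)  # False: still inside the with block

print(f.closed)  # True: closed automatically on leaving the block
print(lines)     # ['first line', 'second line']

os.remove(demo_path)  # clean up the temporary file
```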

In [None]:
f1 = open(path)
f1.read(10)
f2 = open(path, mode="rb")  # Binary mode
f2.read(10)

In [None]:
f1.tell()
f2.tell()

In [None]:
import sys
sys.getdefaultencoding()

In [None]:
f1.seek(3)
f1.read(1)
f1.tell()

In [None]:
f1.close()
f2.close()

In [None]:
path

with open("tmp.txt", mode="w") as handle:
    handle.writelines(x for x in open(path) if len(x) > 1)

with open("tmp.txt") as f:
    lines = f.readlines()

lines

In [None]:
import os
os.remove("tmp.txt")

In [None]:
with open(path) as f:
    chars = f.read(10)

chars
len(chars)

In [None]:
with open(path, mode="rb") as f:
    data = f.read(10)

data

In [None]:
data.decode("utf-8")
data[:4].decode("utf-8")

In [None]:
sink_path = "sink.txt"
with open(path) as source:
    with open(sink_path, "x", encoding="iso-8859-1") as sink:
        sink.write(source.read())

with open(sink_path, encoding="iso-8859-1") as f:
    print(f.read(10))

In [None]:
os.remove(sink_path)

In [None]:
f = open(path, encoding='utf-8')
f.read(5)
f.seek(4)
f.read(1)
f.close()