In [None]:
# Data Visualization 
# BTech Computer Science Stream , January 2025
# Week 2 Python Language Basics - Demonstration Code
# Name: Manoj R, Reg Number , Date: 24/12/2024
# This notebook demonstrates Python’s workhorse data structures (tuples, lists, dicts, and sets) and discusses creating your own reusable Python functions

The following naming conventions are used for Python's data structures:

tuple: tup
sequence: seq
list: variablename_list
dict: dict
set: variablename_set




In [1]:
tup = (4, 5, 6)
tup

(4, 5, 6)

In [2]:
tup = 4, 5, 6
tup

(4, 5, 6)

In [None]:
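When defining tuples in more complicated expressions, it’s often necessary to enclose the values in parentheses, as in the nested_tup example of a tuple of tuples shown later in this section.
You can also convert any sequence or iterator to a tuple by invoking tuple: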

In [21]:
tup=tuple([4, 0, 2])
print(tup[2])
tup = tuple('string')
print(tup)

2
('s', 't', 'r', 'i', 'n', 'g')


### tup[0]

In [None]:
Elements can be accessed with square brackets [] as with most other sequence types. As in C, C++, Java, and many other languages, 
sequences are 0-indexed in Python

In [14]:
nested_tup = (4, 5, 6), (7, 8)
print(nested_tup)
print(nested_tup[0])
print(nested_tup[1])

((4, 5, 6), (7, 8))
(4, 5, 6)
(7, 8)


In [None]:
While the objects stored in a tuple may be mutable themselves, once the tuple is created it’s not possible to modify which object is stored in each slot:

In [6]:
tup = tuple(['foo', [1, 2], True])
tup[2] = False

TypeError: 'tuple' object does not support item assignment

In [None]:
If an object inside a tuple is mutable, such as a list, you can modify it in place

In [7]:
tup[1].append(3)
tup

('foo', [1, 2, 3], True)

In [None]:
You can concatenate tuples using the + operator to produce longer tuples:

In [9]:
(4, None, 'foo') + (6, 0) + ('bar',)

(4, None, 'foo', 6, 0, 'bar')

In [None]:
Multiplying a tuple by an integer, as with lists, has the effect of concatenating together that many copies of the tuple

In [10]:
('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')
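
One detail worth knowing is that multiplying a tuple repeats references to the same objects rather than copying them. The sketch below uses a hypothetical tuple named inner_tup (not part of the original demonstration) that holds one mutable list:

```python
# Hypothetical example: a tuple holding one mutable list
inner_tup = ([1, 2],)
repeated = inner_tup * 3

# All three slots refer to the same list object, so one append shows up everywhere
repeated[0].append(3)
print(repeated)                    # ([1, 2, 3], [1, 2, 3], [1, 2, 3])
print(repeated[0] is repeated[2])  # True
```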

In [None]:
Unpacking tuples

If you try to assign to a tuple-like expression of variables, Python will attempt to unpack the value on the righthand side of the equals sign

In [11]:
tup = (4, 5, 6)
a, b, c = tup
b

5

In [None]:
sequences with nested tuples can be unpacked:

In [12]:
tup = 4, 5, (6, 7)
a, b, (c, d) = tup
d

7

In [None]:
Using this functionality you can easily swap variable names, a task which in many languages might look like:
tmp = a
a = b
b = tmp
But, in Python, the swap can be done like this:


In [2]:
a, b = 1, 2
a

1

In [3]:
b

2

In [4]:
b, a = a, b


In [5]:
a

2

In [6]:
b

1

In [None]:
A common use of variable unpacking is iterating over sequences of tuples or lists

In [7]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print(f'a={a}, b={b}, c={c}')

a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9


In [None]:
Python also supports more advanced tuple unpacking to help with situations where you may want to “pluck” a few elements from the beginning of a tuple.
This uses the special syntax *rest, which is also used in function signatures to capture an arbitrarily long list of positional arguments:

In [1]:
values = 1, 2, 3, 4, 5
a, b, *rest = values
a
b
rest

[3, 4, 5]
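
The same *rest syntax in a function signature can be sketched as follows. The function name describe is a hypothetical one chosen for illustration. Note that inside a function the extra arguments arrive as a tuple, whereas unpacking by assignment produces a list:

```python
# Hypothetical function: first and second are required, *rest collects any remainder
def describe(first, second, *rest):
    return first, second, rest

result = describe(1, 2, 3, 4, 5)
print(result)          # (1, 2, (3, 4, 5))
print(describe(1, 2))  # (1, 2, ())
```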

In [None]:
This rest bit is sometimes something you want to discard; there is nothing special about the rest name.
As a matter of convention, many Python programmers will use the underscore (_) for unwanted variables:

In [21]:
a, b, *_ = values

In [None]:
Tuple methods

Since the size and contents of a tuple cannot be modified, it is very light on instance methods. A particularly useful one
(also available on lists) is count, which counts the number of occurrences of a value:

In [22]:
a = (1, 2, 2, 2, 3, 4, 2)
a.count(2)

4

In [None]:
List

In contrast with tuples, lists are variable-length and their contents can be modified in-place. 
You can define them using square brackets [] or using the list type function:

In [23]:
a_list = [2, 3, 7, None]

tup = ("foo", "bar", "baz")
b_list = list(tup)
b_list
b_list[1] = "peekaboo"
b_list

['foo', 'peekaboo', 'baz']

In [None]:
Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.
    
The list function is frequently used in data processing as a way to materialize an iterator or generator expression:

In [24]:
gen = range(10)
gen
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [None]:
Adding and removing elements

Elements can be appended to the end of the list with the append method:

In [25]:
b_list.append("dwarf")
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

In [None]:
Using insert you can insert an element at a specific location in the list. The insertion index must be between 0 and the length of the list, inclusive.

insert is computationally expensive compared with append, because references to subsequent elements have to be shifted internally to make room for the new element. If you need to insert
elements at both the beginning and end of a sequence, you may wish to explore collections.deque, a double-ended queue, which is optimized for this purpose.


In [33]:
b_list.insert(1, "red")
b_list

['foo', 'red', 'red', 'peekaboo', 'baz', 'dwarf']
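
The collections.deque alternative mentioned above can be sketched like this. The variable name colors and its contents are assumptions made for illustration only:

```python
from collections import deque

# Hypothetical queue of labels; a deque supports cheap inserts at both ends
colors = deque(["peekaboo", "baz"])
colors.appendleft("foo")   # add at the beginning
colors.append("dwarf")     # add at the end
print(list(colors))        # ['foo', 'peekaboo', 'baz', 'dwarf']
```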

In [None]:
The inverse operation to insert is pop, which removes and returns an element at a particular index

In [32]:
b_list.pop(2)
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']

In [None]:
If performance is not a concern, by using append and remove, you can use a Python list as a set-like data structure

In [34]:
b_list.append("foo")
b_list
b_list.remove("foo")
b_list

['red', 'red', 'peekaboo', 'baz', 'dwarf', 'foo']

In [None]:
Check if a list contains a value using the in keyword:

Checking whether a list contains a value is a lot slower than doing so with dicts and sets, as Python makes a linear scan
across the values of the list, whereas it can check the others (based on hash tables) in constant time.

In [35]:
"dwarf" in b_list

True

In [None]:
The keyword not can be used to negate in

In [36]:
"dwarf" not in b_list

False
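
To make the list-versus-set comparison concrete, here is a small sketch. The names values_list, values_set, and target are hypothetical. Both checks give the same answer; only the lookup strategy differs:

```python
# Hypothetical data: the same values stored as a list and as a set
values_list = list(range(100_000))
values_set = set(values_list)

# The list check scans elements one by one; the set check uses a hash lookup
target = 99_999
print(target in values_list)  # True
print(target in values_set)   # True
print(-1 in values_set)       # False
```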

In [None]:
Concatenating and combining lists

Similar to tuples, adding two lists together with + concatenates them:

In [37]:
[4, None, "foo"] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

In [None]:
If you have a list already defined, you can append multiple elements to it using the extend method

In [38]:
x = [4, None, "foo"]
x.extend([7, 8, (2, 3)])
x

[4, None, 'foo', 7, 8, (2, 3)]

In [None]:
Sorting

You can sort a list in-place (without creating a new object) by calling its sort method:

In [40]:
a = [7, 2, 5, 1, 3]
a.sort()
a

[1, 2, 3, 5, 7]

In [None]:
sort has a few options that will occasionally come in handy. One is the ability to pass a secondary sort key, that is, a function that produces a
value to use to sort the objects. For example, we could sort a collection of strings by their lengths:

In [41]:
b = ["saw", "small", "He", "foxes", "six"]
b.sort(key=len)
b

['He', 'saw', 'six', 'small', 'foxes']

In [None]:
Slicing

You can select sections of most sequence types by using slice notation, which in its basic form consists of start:stop passed to the indexing
operator []

In [42]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[1:5]

[2, 3, 7, 5]

In [None]:
Slices can also be assigned to with a sequence

In [43]:
seq[3:5] = [6, 3]
seq

[7, 2, 3, 6, 3, 6, 0, 1]

In [None]:
While the element at the start index is included, the stop index is not included, so that the number of elements in the result is stop - start.
Either the start or stop can be omitted, in which case they default to the start of the sequence and the end of the sequence, respectively

In [44]:
seq[:5]
seq[3:]

[6, 3, 6, 0, 1]

In [None]:
Negative indices slice the sequence relative to the end

In [45]:
seq[-4:]
seq[-6:-2]

[3, 6, 3, 6]

In [None]:
Slicing semantics takes a bit of getting used to, especially if you’re coming from R or MATLAB.
It can help to picture the indices at the “bin edges” between elements, which shows where slice selections start and stop using positive or negative indices.

A step can also be used after a second colon to, say, take every other element:

In [46]:
seq[::2]

[7, 3, 3, 0]

In [47]:
seq[::-1]

[1, 0, 6, 3, 6, 3, 2, 7]

In [None]:
dict

dict may be the most important built-in Python data structure. In other programming languages, dicts are sometimes called hash maps or 
associative arrays. 
A dict stores a collection of key-value pairs, where key and value are Python objects.
Each key is associated with a value so that a value can be conveniently retrieved, inserted, modified, or deleted given a particular key.
One approach for creating one is to use curly braces {} and colons to separate keys and values:

In [48]:
empty_dict = {}
d1 = {"a": "some value", "b": [1, 2, 3, 4]}
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

In [None]:
You can access, insert, or set elements using the same syntax as for accessing elements of a list or tuple

In [49]:
d1[7] = "an integer"
d1
d1["b"]

[1, 2, 3, 4]

In [None]:
You can check if a dict contains a key using the same syntax used for checking whether a list or tuple contains a value

In [50]:
"b" in d1

True

In [None]:
You can delete values either using the del keyword or the pop method (which simultaneously returns the value and deletes the key):

In [51]:
d1[5] = "some value"
d1
d1["dummy"] = "another value"
d1
del d1[5]
d1
ret = d1.pop("dummy")
ret
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [None]:
The keys and values methods give you iterators of the dict’s keys and values, respectively.
The order of the keys depends on the order of their insertion, and these functions output the keys and values in the same respective order

In [52]:
list(d1.keys())
list(d1.values())

['some value', [1, 2, 3, 4], 'an integer']

In [53]:
list(d1.items())

[('a', 'some value'), ('b', [1, 2, 3, 4]), (7, 'an integer')]

In [None]:
You can merge one dict into another using the update method

In [54]:
d1.update({"b": "foo", "c": 12})
d1

{'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

In [None]:
Creating dicts from sequences

It’s common to occasionally end up with two sequences that you want to pair up element-wise in a dict.
Since a dict is essentially a collection of 2-tuples, the dict function accepts a sequence of 2-tuples, such as the pairs produced by zip:

In [55]:
tuples = zip(range(5), reversed(range(5)))
tuples
mapping = dict(tuples)
mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}
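
For comparison, pairing two sequences with an explicit loop might look like the sketch below. The names key_list, value_list, and mapping_loop are hypothetical. The result matches what dict produces from the zipped pairs:

```python
# Hypothetical key and value sequences
key_list = ["a", "b", "c"]
value_list = [1, 2, 3]

# Build the dict one pair at a time
mapping_loop = {}
for key, value in zip(key_list, value_list):
    mapping_loop[key] = value

print(mapping_loop)                                      # {'a': 1, 'b': 2, 'c': 3}
print(mapping_loop == dict(zip(key_list, value_list)))   # True
```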

In [None]:
Default values

It’s common to have logic like:

if key in some_dict:
    value = some_dict[key]
else:
    value = default_value

Thus, the dict methods get and pop can take a default value to be returned, so that the above if-else block can be written simply as:
value = some_dict.get(key, default_value)
get by default will return None if the key is not present, while pop will raise an exception. With setting values, 
it may be that the values in a dict are another kind of collection, like a list. 
For example, you could imagine categorizing a list of words by their first letters as a dict of lists:

In [56]:
words = ["apple", "bat", "bar", "atom", "book"]
by_letter = {}

for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)

by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}
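
The default-value behavior of get and pop described above can be checked with a short sketch. The dict contents here are made up for illustration:

```python
# Hypothetical dict used only for illustration
some_dict = {"a": 1, "b": 2}

print(some_dict.get("a", 0))        # 1
print(some_dict.get("missing"))     # None
print(some_dict.get("missing", 0))  # 0

print(some_dict.pop("b", 0))        # 2 (and the key "b" is removed)
print(some_dict.pop("missing", 0))  # 0

try:
    some_dict.pop("missing")        # no default given, so this raises
except KeyError as err:
    print("KeyError:", err)         # KeyError: 'missing'
```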

In [None]:
The setdefault dict method can be used to simplify this workflow. The preceding for loop can be rewritten as:
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)

The built-in collections module has a useful class, defaultdict, which makes this even easier. To create one, you pass a type or function for
generating the default value for each slot in the dict

In [57]:
by_letter = {}
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [60]:
from collections import defaultdict
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
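
Any callable that produces a default value works with defaultdict. As a further sketch (the counts variable is a hypothetical addition), passing int gives each missing key a starting value of 0, which is handy for counting:

```python
from collections import defaultdict

words = ["apple", "bat", "bar", "atom", "book"]

# defaultdict(int) starts every missing key at int(), which is 0
counts = defaultdict(int)
for word in words:
    counts[word[0]] += 1

print(dict(counts))  # {'a': 2, 'b': 3}
```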

In [None]:
Valid dict key types

While the values of a dict can be any Python object, the keys generally have to be immutable objects like 
scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too). The technical term here is hashability.
You can check whether an object is hashable (can be used as a key in a dict) with the hash function

In [61]:
hash("string")
hash((1, 2, (2, 3)))
hash((1, 2, [2, 3])) # fails because lists are mutable

TypeError: unhashable type: 'list'

In [None]:
To use a list as a key, one option is to convert it to a tuple, which can be hashed as long as its elements also can

In [62]:
d = {}
d[tuple([1, 2, 3])] = 5
d

{(1, 2, 3): 5}

In [None]:
set

A set is an unordered collection of unique elements. You can think of them like dict keys, but keys only, no values. 
A set can be created in two ways: via the set function or via a set literal with curly braces:

In [63]:
set([2, 2, 2, 1, 3, 3])
{2, 2, 2, 1, 3, 3}

{1, 2, 3}

In [64]:
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

In [None]:
The union of these two sets is the set of distinct elements occurring in either set. 
This can be computed with either the union method or the | binary operator

In [65]:
a.union(b)
a | b

{1, 2, 3, 4, 5, 6, 7, 8}

In [None]:
The intersection contains the elements occurring in both sets.
The & operator or the intersection method can be used

In [66]:
a.intersection(b)
a & b

{3, 4, 5}

In [None]:
All of the logical set operations have in-place counterparts, which enable you to replace the contents of the set on the left side
of the operation with the result. For very large sets, this may be more efficient:

In [67]:
c = a.copy()
c |= b
c
d = a.copy()
d &= b
d

{3, 4, 5}

In [None]:
Like a dict’s keys, a set’s elements generally must be immutable, and they must be hashable (which means that calling hash on a value 
does not raise an exception). In order to store list-like elements (or other mutable sequences) in a set, you can convert them to tuples

In [69]:
my_data = [1, 2, 3, 4]
my_set = {tuple(my_data)}
my_set

{(1, 2, 3, 4)}

In [None]:
You can also check if a set is a subset of (is contained in) or a superset of (contains all elements of) another set

In [70]:
a_set = {1, 2, 3, 4, 5}
{1, 2, 3}.issubset(a_set)
a_set.issuperset({1, 2, 3})

True

In [None]:
Sets are equal if and only if their contents are equal

In [71]:
{1, 2, 3} == {3, 2, 1}

True

In [72]:
sorted([7, 1, 2, 6, 0, 3, 2])
sorted("horse race")

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

In [73]:
seq1 = ["foo", "bar", "baz"]
seq2 = ["one", "two", "three"]
zipped = zip(seq1, seq2)
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

In [74]:
seq3 = [False, True]
list(zip(seq1, seq2, seq3))

[('foo', 'one', False), ('bar', 'two', True)]

In [75]:
for index, (a, b) in enumerate(zip(seq1, seq2)):
    print(f"{index}: {a}, {b}")


0: foo, one
1: bar, two
2: baz, three


In [61]:
list(reversed(range(10)))

In [62]:
strings = ["a", "as", "bat", "car", "dove", "python"]
[x.upper() for x in strings if len(x) > 2]

In [63]:
unique_lengths = {len(x) for x in strings}
unique_lengths

In [64]:
set(map(len, strings))

In [65]:
loc_mapping = {value: index for index, value in enumerate(strings)}
loc_mapping

In [66]:
all_data = [["John", "Emily", "Michael", "Mary", "Steven"],
            ["Maria", "Juan", "Javier", "Natalia", "Pilar"]]

In [67]:
names_of_interest = []
for names in all_data:
    enough_as = [name for name in names if name.count("a") >= 2]
    names_of_interest.extend(enough_as)
names_of_interest

In [68]:
result = [name for names in all_data for name in names
          if name.count("a") >= 2]
result

In [69]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

In [70]:
flattened = []

for tup in some_tuples:
    for x in tup:
        flattened.append(x)

In [71]:
[[x for x in tup] for tup in some_tuples]

In [None]:
Functions

Functions are the primary and most important method of code organization and reuse in Python. As a rule of thumb, if you anticipate needing to 
repeat the same or very similar code more than once, it may be worth writing a reusable function. Functions can also help make your code 
more readable by giving a name to a group of Python statements.
    
Functions are declared with the def keyword. A function contains a block of code with an optional use of the return keyword:

In [76]:
def my_function(x, y):
    return x + y

In [77]:
my_function(1, 2)
result = my_function(1, 2)
result

3

In [None]:
There is no issue with having multiple return statements. If Python reaches the end of a function without encountering a return statement, 
None is returned automatically. For example

In [78]:
def function_without_return(x):
    print(x)

result = function_without_return("hello!")
print(result)

hello!
None


In [79]:
def my_function2(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

In [None]:
Each function can have positional arguments and keyword arguments.
Keyword arguments are most commonly used to specify default values or optional arguments.
In the preceding function, x and y are positional arguments while z is a keyword argument. This means that the function
can be called in any of these ways:

In [84]:
my_function2(5, 6, z=0.7)
my_function2(3.14, 7, 3.5)
my_function2(10, 20)

45.0
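
Keywords can also be used for the positional parameters, which lets you pass arguments in any order as long as positional arguments come before keyword arguments in the call. A brief sketch, repeating the function definition so the example is self-contained:

```python
def my_function2(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

# Keywords can name the positional parameters too, in any order
print(my_function2(x=5, y=6, z=7))  # 77
print(my_function2(y=6, x=5, z=7))  # 77
```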

In [None]:
Namespaces, Scope, and Local Functions

Functions can access variables created inside the function as well as those outside the function in higher (or even global) scopes. 
An alternative and more descriptive name describing a variable scope in Python is a namespace. 
Any variables that are assigned within a function by default are assigned to the local namespace. 
The local namespace is created when the function is called and immediately populated by the function’s arguments.
After the function is finished, the local namespace is destroyed (with some exceptions that are outside the purview of this chapter). Consider the
following function:

def func():
    a = []
    for i in range(5):
        a.append(i)

When func() is called, the empty list a is created, five elements are appended, and then a is destroyed when the function exits. Suppose instead
we had declared a as follows:

In [77]:
a = []
def func():
    for i in range(5):
        a.append(i)

In [None]:
Each call to func will modify the list a

In [85]:
func()
a
func()
a

[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

In [None]:
Assigning variables outside of the function’s scope is possible, but those variables must be declared explicitly using either the global or
nonlocal keywords.
nonlocal allows a function to modify variables defined in a higher-level scope that is not global.

In [86]:
a = None
def bind_a_variable():
    global a
    a = []
bind_a_variable()
print(a)

[]
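
The nonlocal keyword has no example above, so here is a minimal sketch. The make_counter and increment names are hypothetical and chosen only for illustration:

```python
# Hypothetical counter factory used only to illustrate nonlocal
def make_counter():
    count = 0  # lives in make_counter's scope, not the global scope

    def increment():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```

Without the nonlocal declaration, the line count += 1 would raise UnboundLocalError, because assignment inside increment would make count a local variable.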


In [None]:
Functions Are Objects

Since Python functions are objects, many constructs can be easily expressed that are difficult to do in other languages. Suppose we were doing some
data cleaning and needed to apply a bunch of transformations to the following list of strings:

In [80]:
states = ["   Alabama ", "Georgia!", "Georgia", "georgia", "FlOrIda",
          "south   carolina##", "West virginia?"]

In [None]:
Anyone who has ever worked with user-submitted survey data has seen messy results like these.
Lots of things need to happen to make this list of strings uniform and ready for analysis: stripping whitespace, removing
punctuation symbols, and standardizing on proper capitalization.
One way to do this is to use built-in string methods along with the re standard library module for regular expressions:

In [9]:
import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub("[!#?]", "", value)
        value = value.title()
        result.append(value)
    return result

In [11]:
clean_strings(states)

['Alabama', 'Georgia', 'Georgia', 'Georgia', 'Florida', 'South   Carolina', 'West Virginia']

In [83]:
def remove_punctuation(value):
    return re.sub("[!#?]", "", value)

clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for func in ops:
            value = func(value)
        result.append(value)
    return result

In [84]:
clean_strings(states, clean_ops)

In [85]:
for x in map(remove_punctuation, states):
    print(x)

In [None]:
Anonymous (Lambda) Functions

Python has support for so-called anonymous or lambda functions, which are a way of writing functions consisting of a single statement, the result of
which is the return value. They are defined with the lambda keyword, which has no meaning other than “we are declaring an anonymous function”:

In [86]:
def short_function(x):
    return x * 2

equiv_anon = lambda x: x * 2

In [None]:
lambda functions are especially convenient in data analysis because, as you’ll see, 
there are many cases where data transformation functions will take functions as arguments.
It’s often less typing (and clearer) to pass a lambda function as opposed to writing a full-out function declaration or even assigning the lambda
function to a local variable. For example

In [87]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]

ints = [4, 0, 1, 5, 6]
apply_to_list(ints, lambda x: x * 2)

In [88]:
strings = ["foo", "card", "bar", "aaaa", "abab"]

In [89]:
strings.sort(key=lambda x: len(set(x)))
strings

In [None]:
Generators

Having a consistent way to iterate over sequences, like objects in a list or lines in a file, is an important Python feature. 
This is accomplished by means of the iterator protocol, a generic way to make objects iterable. For example, iterating over a dict yields the dict keys:

In [90]:
some_dict = {"a": 1, "b": 2, "c": 3}
for key in some_dict:
    print(key)

In [None]:
An iterator is any object that will yield objects to the Python interpreter when used in a context like a for loop. Most methods expecting a list or
list-like object will also accept any iterable object. This includes built-in methods such as min, max, and sum, and type constructors like list and
tuple:

In [91]:
dict_iterator = iter(some_dict)
dict_iterator

In [92]:
list(dict_iterator)
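
Under the hood, a for loop repeatedly asks the iterator for its next item. The sketch below does this by hand with the built-in next function, redefining some_dict so the example is self-contained:

```python
some_dict = {"a": 1, "b": 2, "c": 3}
dict_iterator = iter(some_dict)

print(next(dict_iterator))  # a
print(next(dict_iterator))  # b
print(next(dict_iterator))  # c

# Once exhausted, next raises StopIteration unless a default is supplied
print(next(dict_iterator, "done"))  # done
```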

In [None]:
A generator is a convenient way, similar to writing a normal function, to construct a new iterable object. Whereas normal functions execute and
return a single result at a time, generators return a sequence of multiple results lazily, pausing after each one until the next one is requested. 
To create a generator, use the yield keyword instead of return in a function:

In [93]:
def squares(n=10):
    print(f"Generating squares from 1 to {n ** 2}")
    for i in range(1, n + 1):
        yield i ** 2

In [94]:
gen = squares()
gen

In [95]:
for x in gen:
    print(x, end=" ")
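
The lazy behavior can be observed directly by stepping through the generator with next. This sketch repeats the squares definition and uses a smaller n so the output stays short:

```python
def squares(n=10):
    print(f"Generating squares from 1 to {n ** 2}")
    for i in range(1, n + 1):
        yield i ** 2

gen = squares(3)    # nothing is printed yet; the body has not started
first = next(gen)   # the body runs up to the first yield and prints the message
print(first)        # 1
print(list(gen))    # [4, 9]
print(list(gen))    # [] because the generator is now exhausted
```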

In [None]:
Generator expressions

Another way to make a generator is by using a generator expression. This is a generator analogue to list, dict, and set comprehensions. To create one,
enclose what would otherwise be a list comprehension within parentheses instead of brackets:

In [96]:
gen = (x ** 2 for x in range(100))
gen

In [97]:
sum(x ** 2 for x in range(100))
dict((i, i ** 2) for i in range(5))

In [None]:
itertools module

The standard library itertools module has a collection of generators for many common data algorithms. For example, groupby takes any sequence and a function, grouping consecutive elements in the sequence by
the return value of the function. Here’s an example:

In [98]:
import itertools
def first_letter(x):
    return x[0]

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]

for letter, group in itertools.groupby(names, first_letter):
    print(letter, list(group))  # group is an iterator over the matching names
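
Because groupby only groups consecutive elements, the same first letter can appear in more than one group when the input is unsorted. One common approach, sketched here with a hypothetical grouped dict, is to sort by the same key function first:

```python
import itertools

def first_letter(x):
    return x[0]

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]

# Sorting by the same key first puts all names with the same letter together
sorted_names = sorted(names, key=first_letter)
grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(sorted_names, first_letter)
}
print(grouped)
# {'A': ['Alan', 'Adam', 'Albert'], 'S': ['Steven'], 'W': ['Wes', 'Will']}
```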

In [99]:
float("1.2345")
float("something")

In [100]:
def attempt_float(x):
    try:
        return float(x)
    except:
        return x

In [101]:
attempt_float("1.2345")
attempt_float("something")

In [102]:
float((1, 2))

In [103]:
def attempt_float(x):
    try:
        return float(x)
    except ValueError:
        return x

In [104]:
attempt_float((1, 2))

In [105]:
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

In [106]:
path = "examples/segismundo.txt"
f = open(path, encoding="utf-8")

In [107]:
lines = [x.rstrip() for x in open(path, encoding="utf-8")]
lines

In [108]:
f.close()

In [109]:
with open(path, encoding="utf-8") as f:
    lines = [x.rstrip() for x in f]

In [110]:
f1 = open(path)
f1.read(10)
f2 = open(path, mode="rb")  # Binary mode
f2.read(10)

In [111]:
f1.tell()
f2.tell()

In [112]:
import sys
sys.getdefaultencoding()

In [113]:
f1.seek(3)
f1.read(1)
f1.tell()

In [114]:
f1.close()
f2.close()

In [115]:
path

with open("tmp.txt", mode="w") as handle:
    handle.writelines(x for x in open(path) if len(x) > 1)

with open("tmp.txt") as f:
    lines = f.readlines()

lines

In [116]:
import os
os.remove("tmp.txt")

In [117]:
with open(path) as f:
    chars = f.read(10)

chars
len(chars)

In [118]:
with open(path, mode="rb") as f:
    data = f.read(10)

data

In [119]:
data.decode("utf-8")
data[:4].decode("utf-8")

In [120]:
sink_path = "sink.txt"
with open(path) as source:
    with open(sink_path, "x", encoding="iso-8859-1") as sink:
        sink.write(source.read())

with open(sink_path, encoding="iso-8859-1") as f:
    print(f.read(10))

In [121]:
os.remove(sink_path)

In [122]:
f = open(path, encoding='utf-8')
f.read(5)
f.seek(4)
f.read(1)
f.close()