# Built-in Data Structures, Functions, and File Handling in Python

## Data Structures and Sequences
* Tuple {tup = (1,2,3)}
    - UNPACKING TUPLES {tup = (4, 5, 6); a, b, c = tup}
    - TUPLE METHODS
        + count { a = (1, 2, 2, 2, 3, 4, 2); a.count(2) }
* List {mylist = [1,2,3,4]}
    - ADDING AND REMOVING ELEMENTS { b_list.append('dwarf'); b_list.insert(1, 'red') }
        + insert is computationally expensive compared with append, because every element after the insertion point has to be shifted to make room
    - The inverse operation to insert is pop { b_list.pop(2)}
    - Elements can be removed by value with remove { b_list.remove('foo')}
    - Check if a list contains a value using the in keyword: { 'dwarf' in b_list}
    - CONCATENATING AND COMBINING LISTS { [4, None, 'foo'] + [7, 8, (2, 3)]}
    - Once a list exists, you can append multiple elements to it using the extend method: { x = [4, None, 'foo']; x.extend([7, 8, (2, 3)])}
    - Concatenation with + creates a new list and copies the elements over, so using extend to append elements to an existing list is usually preferable, especially if you are building up a large list
    - SORTING {a = [7, 2, 5, 1, 3];a.sort()}
    - Sort has a few options that will occasionally come in handy. One is the ability to pass a secondary sort key—that is, a function that produces a value to use to sort the objects { b = ['saw', 'small', 'He', 'foxes', 'six']; b.sort(key=len)}
    - BINARY SEARCH AND MAINTAINING A SORTED LIST { import bisect; c = [1, 2, 2, 2, 3, 4, 7]; bisect.bisect(c, 2)}
    - bisect.insort actually inserts the element into that location: { bisect.insort(c, 6)}
    - SLICING
        + start:stop passed to the indexing operator [], start index is included, the stop index is not included
        + A step can also be used after a second colon to, say, take every other element
        + A clever use of this is to pass -1, which has the useful effect of reversing a list or tuple { seq[::-1] }
* Built-in Sequence Functions
    - ENUMERATE: Python has a built-in function, enumerate, which returns a sequence of (i, value) tuples: { for i, value in enumerate(collection):; #do something with value}
    - SORTED: The sorted function returns a new sorted list from the elements of any sequence { sorted([7, 1, 2, 6, 0, 3, 2]) }
    - ZIP: zip “pairs” up the elements of a number of lists, tuples, or other sequences. In Python 3 it returns a lazy iterator of tuples, so wrap it in list() to see the pairs {seq1 = ['foo', 'bar', 'baz']; seq2 = ['one', 'two', 'three']; zipped = zip(seq1, seq2); list(zipped)}
        + A very common use of zip is simultaneously iterating over multiple sequences, possibly also combined with enumerate: {for i, (a, b) in enumerate(zip(seq1, seq2)):;print('{0}: {1}, {2}'.format(i, a, b))}
    - REVERSED: reversed iterates over the elements of a sequence in reverse order. It returns an iterator, so materialize it with list or a for loop { list(reversed(range(10))) }
* Dict: dict is likely the most important built-in Python data structure. A more common name for it is hash map or associative array ( empty_dict = {}; d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]} )
    - You can delete values either using the del keyword or the pop method ( d1 = {5:'Some other','a' : 'some value', 'b' : [1, 2, 3, 4]}; del d1[5] or ret = d1.pop(5) )
    - You can merge one dict into another using the update method:( d1.update({'b' : 'foo', 'c' : 12}) )
    - CREATING DICTS FROM SEQUENCES { mapping = dict(zip(range(5), reversed(range(5)))) }
    - DEFAULT VALUES
        + the dict methods get and pop can take a default value to be returned {value = some_dict.get(key, default_value)}
        + The built-in collections module has a useful class, defaultdict { from collections import defaultdict; by_letter = defaultdict(list) }
    - VALID DICT KEY TYPES
        + The keys generally have to be immutable objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too) - term here is hashability {hash('string'); hash((1, 2, [2, 3])) # fails because lists are mutable}
* set : A set is an unordered collection of unique elements {set([2, 2, 2, 1, 3, 3])}
    - Sets support mathematical set operations like union, intersection, difference, and symmetric difference. Consider these two example sets { a = {1, 2, 3, 4, 5}; b = {3, 4, 5, 6, 7, 8}; a.union(b); a | b; a.intersection(b); a & b;}
* List, Set, and Dict Comprehensions
    - List comprehension: [expr for val in collection if condition] { strings = ['a', 'as', 'bat', 'car', 'dove', 'python']; [x.upper() for x in strings if len(x) > 2] }
    - Dict comprehension: dict_comp = {key-expr : value-expr for value in collection if condition}
    - Set comprehension: set_comp = {expr for value in collection if condition}
    - NESTED LIST COMPREHENSIONS {result = [name for names in all_data for name in names if name.count('e') >= 2] }
* Functions - Functions are declared with the def keyword and hand a value back to the caller with the return keyword:
    def my_function(x, y, z=1.5):
        if z > 1:
            return z * (x + y)
        else:
            return z / (x + y)
    - There is no issue with having multiple return statements. If Python reaches the end of a function without encountering a return statement, None is returned automatically.

    - Each function can have positional arguments and keyword arguments. Keyword arguments are most commonly used to specify default values or optional arguments
    - The main restriction on function arguments is that the keyword arguments must follow the positional arguments (if any).
    - Namespaces, Scope, and Local Functions
        + Assigning variables outside of the function’s scope is possible, but those variables must be declared as global via the global keyword:
        def bind_a_variable(): 
            global a
            a = []
        bind_a_variable()
        + Use of the global keyword is generally discouraged
    - Returning Multiple Values
        def f():
            a = 5
            b = 6
            c = 7
            return a, b, c

        a, b, c = f()
    - Functions Are Objects { clean_ops = [str.title, str.strip]}
    - map function: applies a function to each element of a sequence { for x in map(str.title, value):; print(x)}
    - Anonymous (lambda) functions: a way of writing functions consisting of a single expression, the result of which is the return value
        def short_function(x):
            return x*2
        equiv_anon = lambda x: x*2
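
Lambda functions are most often passed as a short argument to another function. The sketch below is one illustrative use, not part of the original notes: it sorts a made-up list of strings by the number of distinct letters in each string, using a lambda as the `key` for `list.sort`.

```python
# Illustrative data; the list contents are an assumption for this example.
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

# The lambda maps each string to its count of distinct letters,
# and sort orders the strings by that count (ties keep their original order).
strings.sort(key=lambda s: len(set(s)))

print(strings)  # ['aaaa', 'foo', 'abab', 'bar', 'card']
```

Because the sort is stable, 'foo' stays ahead of 'abab' even though both have two distinct letters.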


In [3]:
tup = 1,3,5,6

In [2]:
tup

(1, 3, 5, 6)

In [4]:
tup = ('foo',[1,2,3],True)

In [5]:
tup[1].append(4)

In [6]:
tup

('foo', [1, 2, 3, 4], True)

In [7]:
tup[0]='bar'

TypeError: 'tuple' object does not support item assignment

In [8]:
tup[0].append('bar')

AttributeError: 'str' object has no attribute 'append'

In [9]:
type(tup)

tuple

In [10]:
type(tup[0])

str

In [12]:
for tups in tup:
    print(type(tups))

<class 'str'>
<class 'list'>
<class 'bool'>


In [13]:
a,b,c = tup

In [14]:
b.append(5)

In [15]:
b

[1, 2, 3, 4, 5]

In [16]:
c

True

In [17]:
c = Fales

NameError: name 'Fales' is not defined

In [19]:
c = False

In [20]:
tup

('foo', [1, 2, 3, 4, 5], True)

In [21]:
tup = a,b,c

In [22]:
tup

('foo', [1, 2, 3, 4, 5], False)

In [23]:
seq = [[1,3,4],[10,30,40],[100,300,400]]

In [24]:
for a,b,c in seq:
    print('{0}, {1}, {2}'.format(a,b,c))

1, 3, 4
10, 30, 40
100, 300, 400


In [28]:
import random
seq = [random.randint(1,100) for x in range(10)]

In [29]:
seq

[25, 32, 92, 73, 95, 52, 15, 39, 41, 95]

In [30]:
a,b,*rest = seq

In [31]:
a,b,*rest

(25, 32, 92, 73, 95, 52, 15, 39, 41, 95)

In [32]:
*rest

SyntaxError: can't use starred expression here (<ipython-input-32-3ef1f4fd22eb>, line 1)

In [33]:
rest

[92, 73, 95, 52, 15, 39, 41, 95]
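
The cells above show that a starred name is only valid as part of an assignment target or inside a larger expression. A minimal sketch of the common unpacking idioms, using a made-up tuple named `values`:

```python
values = (1, 2, 3, 4, 5)

# The starred name collects everything that is left over, always as a list.
first, second, *rest = values
print(first, second, rest)  # 1 2 [3, 4, 5]

# By convention, an underscore marks values that are deliberately ignored.
head, *_ = values
print(head)  # 1

# Packing and unpacking together swap two names without a temporary variable.
x, y = 10, 20
x, y = y, x
print(x, y)  # 20 10
```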

## Tuple and List Functions/Methods

In [34]:
seq.count(95)

2

In [35]:
seq.append(10)

In [36]:
seq

[25, 32, 92, 73, 95, 52, 15, 39, 41, 95, 10]

In [37]:
seq.length()

AttributeError: 'list' object has no attribute 'length'

In [38]:
seq.len()

AttributeError: 'list' object has no attribute 'len'

In [39]:
len(seq)

11

In [45]:
seq = [random.randint(1,100) for x in range(10)]

In [43]:
[random.randint(1,100) for x in range(10)]

[4, 14, 26, 46, 14, 81, 17, 22, 100, 85]

In [46]:
seq

[50, 44, 81, 98, 22, 4, 24, 69, 86, 83]

In [47]:
seq.sort

<function list.sort>

In [48]:
seq.sort()

In [49]:
seq

[4, 22, 24, 44, 50, 69, 81, 83, 86, 98]

In [50]:
import bisect

In [51]:
bisect.bisect(seq,25)

3

In [53]:
bisect.insort(seq,25)

In [54]:
seq

[4, 22, 24, 25, 44, 50, 69, 81, 83, 86, 98]

In [55]:
seq = [random.randint(1,100) for x in range(10)]

In [56]:
seq

[23, 33, 38, 50, 31, 23, 20, 25, 7, 57]

In [59]:
bisect.bisect(seq.sort(), 50)

TypeError: object of type 'NoneType' has no len()

In [60]:
seq.sort()

In [62]:
bisect.bisect(seq, 50)

9

In [63]:
seq

[7, 20, 23, 23, 25, 31, 33, 38, 50, 57]

In [65]:
bisect.insort(seq,40)

In [74]:
seq

[7, 20, 23, 23, 25, 31, 33, 38, 40, 40, 50, 57]
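
The `TypeError` above comes from the fact that `list.sort()` sorts in place and returns `None`, so there is nothing for `bisect` to search. The sketch below, which reuses the same numbers under the illustrative name `data`, shows `sorted()` as the alternative when a sorted list is needed inside an expression.

```python
import bisect

data = [23, 33, 38, 50, 31, 23, 20, 25, 7, 57]

# sorted() returns a new list, so it can be passed straight to bisect.
position = bisect.bisect(sorted(data), 50)
print(position)  # 9

# list.sort() sorts in place and returns None.
result = data.sort()
print(result)  # None

# insort keeps an already sorted list sorted.
# Neither bisect nor insort checks that the list is sorted beforehand.
bisect.insort(data, 40)
print(data)  # [7, 20, 23, 23, 25, 31, 33, 38, 40, 50, 57]
```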

In [75]:
seq[::2]

[7, 23, 25, 33, 40, 50]

In [76]:
seq[::-1]

[57, 50, 40, 40, 38, 33, 31, 25, 23, 23, 20, 7]

In [77]:
seq[::1]

[7, 20, 23, 23, 25, 31, 33, 38, 40, 40, 50, 57]
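
The cells above only vary the step. As a short sketch of the remaining slicing rules from the outline (start included, stop excluded), using a shortened illustrative list:

```python
seq = [7, 20, 23, 23, 25, 31]

print(seq[1:4])   # [20, 23, 23]  start index included, stop index excluded
print(seq[:2])    # [7, 20]       omitted start defaults to the beginning
print(seq[-2:])   # [25, 31]      negative indices count from the end
print(seq[::2])   # [7, 23, 25]   every other element
print(seq[::-1])  # [31, 25, 23, 23, 20, 7]

# Assigning to a slice replaces that span, even with a different length.
seq[1:3] = ['a', 'b', 'c']
print(seq)        # [7, 'a', 'b', 'c', 23, 25, 31]
```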

In [78]:
myfamily = ['Chirag','Parul','Vidhi','Janvi','Rameshchandra','Madhuben']

In [81]:
mapping = {}
for i, v in enumerate(myfamily):
    print('Index of name {1} is {0}'.format(i,v))
    mapping[i]=v

Index of name Chirag is 0
Index of name Parul is 1
Index of name Vidhi is 2
Index of name Janvi is 3
Index of name Rameshchandra is 4
Index of name Madhuben is 5


In [83]:
mapping.del(0)

SyntaxError: invalid syntax (<ipython-input-83-bb1f9a66da50>, line 1)

In [91]:
del mapping[0]

In [92]:
mapping

{1: 'Parul', 2: 'Vidhi', 3: 'Janvi', 4: 'Rameshchandra', 5: 'Madhuben'}

In [109]:
mydict = dict(zip([0,1,2,4],[4,2,1,0]))

In [104]:
list(zip(range(5)))

[(0,), (1,), (2,), (3,), (4,)]

In [110]:
mydict

{0: 4, 1: 2, 2: 1, 4: 0}

In [111]:
zip([0,1,2,4],[4,2,1,0])

<zip at 0x20b637f3a08>

In [112]:
list(zip([0,1,2,4],[4,2,1,0]))

[(0, 4), (1, 2), (2, 1), (4, 0)]
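
Since `zip` pairs sequences up, it can also be used to split a list of pairs back into separate sequences. A small sketch, using the pairs printed above under the illustrative name `pairs`:

```python
pairs = [(0, 4), (1, 2), (2, 1), (4, 0)]

# The * operator passes each pair as a separate argument,
# so zip regroups the values by position.
keys, values = zip(*pairs)
print(keys)    # (0, 1, 2, 4)
print(values)  # (4, 2, 1, 0)

# zip stops as soon as the shortest input is exhausted.
print(list(zip('abc', [1, 2])))  # [('a', 1), ('b', 2)]
```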

## Built-in Sequence Functions

### Enumerate - returns the index and value of each item in an iterable or sequence

In [119]:
seq = [random.randint(1,100) for x in range(10)]

In [120]:
seq
sorted(seq)
seq

[30, 100, 45, 82, 86, 61, 3, 42, 21, 31]

In [121]:
sorted(seq)

[3, 21, 30, 31, 42, 45, 61, 82, 86, 100]

In [122]:
seq2 = [random.randint(100,200) for x in range(10)]

In [123]:
seq

[30, 100, 45, 82, 86, 61, 3, 42, 21, 31]

In [125]:
seq2

[187, 162, 148, 101, 181, 199, 200, 109, 159, 154]

In [132]:
for i, (a, b) in enumerate(zip(sorted(seq),sorted(seq2))):
    print('seq number {0}, seq2 number {1}'.format(a,b))

seq number 3, seq2 number 101
seq number 21, seq2 number 109
seq number 30, seq2 number 148
seq number 31, seq2 number 154
seq number 42, seq2 number 159
seq number 45, seq2 number 162
seq number 61, seq2 number 181
seq number 82, seq2 number 187
seq number 86, seq2 number 199
seq number 100, seq2 number 200
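
The loop above unpacks the index `i` but does not print it. A sketch of the same pattern that does use the index, with `enumerate`'s `start` parameter for 1-based numbering, followed by `reversed` from the outline. The two short lists are the ones from the outline.

```python
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']

lines = []
# start=1 makes the counter begin at 1 instead of 0.
for i, (a, b) in enumerate(zip(seq1, seq2), start=1):
    lines.append('{0}: {1}, {2}'.format(i, a, b))
print('\n'.join(lines))

# reversed returns an iterator, so list() is needed to see the values.
print(list(reversed(range(5))))  # [4, 3, 2, 1, 0]
```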


In [133]:
student = {'age': 21, 'name':'Janvi', 'grade':4}

In [134]:
student['age']

21

In [136]:
student.update({'result':'pass'})

In [137]:
student

{'age': 21, 'grade': 4, 'name': 'Janvi', 'result': 'pass'}

In [138]:
student[2]

KeyError: 2

In [139]:
student[0]

KeyError: 0

In [140]:
student['classroom']=201

In [141]:
student

{'age': 21, 'classroom': 201, 'grade': 4, 'name': 'Janvi', 'result': 'pass'}

In [142]:
del student['result']

In [143]:
student

{'age': 21, 'classroom': 201, 'grade': 4, 'name': 'Janvi'}

In [145]:
student.get('age of student',50)

50

In [146]:
student.get('age',50)

21
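
`pop` accepts a default in the same way `get` does, which avoids the `KeyError` seen earlier when a key is missing. A brief sketch using the same `student` dict:

```python
student = {'age': 21, 'name': 'Janvi', 'grade': 4}

# Without a default, get returns None for a missing key.
print(student.get('classroom'))       # None
print(student.get('classroom', 201))  # 201

# pop removes the key and returns its value.
print(student.pop('grade'))           # 4

# With a default, pop does not raise KeyError when the key is absent.
print(student.pop('grade', 'n/a'))    # n/a
print('grade' in student)             # False
```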

In [147]:
words = ['apple','age', 'banana','big','cat', 'cannon','chirag','parul','patel']

In [150]:
by_letters = {}
for word in words:
    firstletter = word[0]
    if firstletter not in by_letters:
        by_letters[firstletter] = [word]
    else:
        by_letters[firstletter].append(word)

In [151]:
by_letters

{'a': ['apple', 'age'],
 'b': ['banana', 'big'],
 'c': ['cat', 'cannon', 'chirag'],
 'p': ['parul', 'patel']}

In [152]:
by_letters_d = {}
for word in words:
    letter=word[0]
    by_letters_d.setdefault(letter,[]).append(word)

In [153]:
by_letters_d

{'a': ['apple', 'age'],
 'b': ['banana', 'big'],
 'c': ['cat', 'cannon', 'chirag'],
 'p': ['parul', 'patel']}

In [176]:
def add_word_by_letter(letter):
    return [word for word in words if word[0] == letter]

In [190]:
from collections import defaultdict
#by_letters_c = defaultdict(list)
letter_set = {word[0] for word in words}
by_letters_c = {word[0]:add_word_by_letter(word[0]) for word in words}

In [187]:
by_letters_c

{'a': ['apple', 'age'],
 'b': ['banana', 'big'],
 'c': ['cat', 'cannon', 'chirag'],
 'p': ['parul', 'patel']}
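
The dict comprehension above calls `add_word_by_letter` once per word, so the word list is rescanned for every entry. The `defaultdict` that was imported but commented out builds the same grouping in a single pass; a minimal sketch, with the illustrative name `by_letter`:

```python
from collections import defaultdict

words = ['apple', 'age', 'banana', 'big', 'cat', 'cannon', 'chirag', 'parul', 'patel']

# defaultdict(list) creates an empty list the first time a key is accessed,
# so no membership test or setdefault call is needed.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
```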

In [167]:
words

['apple', 'age', 'banana', 'big', 'cat', 'cannon', 'chirag', 'parul', 'patel']

In [191]:
letter_set

{'a', 'b', 'c', 'p'}
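
The outline lists the mathematical set operations, but the cells only build a set. A short sketch with the two example sets from the outline. Results are wrapped in `sorted` because sets have no guaranteed display order.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

print(sorted(a | b))  # [1, 2, 3, 4, 5, 6, 7, 8]  union
print(sorted(a & b))  # [3, 4, 5]                 intersection
print(sorted(a - b))  # [1, 2]                    difference
print(sorted(a ^ b))  # [1, 2, 6, 7, 8]           symmetric difference

# Subset tests are available as methods.
print({1, 2}.issubset(a))  # True
```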

In [192]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

In [194]:
list(map(len,strings))

[1, 2, 3, 3, 4, 6]

In [195]:
words

['apple', 'age', 'banana', 'big', 'cat', 'cannon', 'chirag', 'parul', 'patel']

In [200]:
enough_n = [word for word in words if word.count('n') >=2]

In [201]:
enough_n

['banana', 'cannon']

### Nested comprehension

In [202]:
nested_words = [words, ['some','other','words']]

In [203]:
nested_words

[['apple',
  'age',
  'banana',
  'big',
  'cat',
  'cannon',
  'chirag',
  'parul',
  'patel'],
 ['some', 'other', 'words']]

In [205]:
word_with_e = [word for words1 in nested_words for word in words1 if word.count('e')>=1]

In [206]:
word_with_e

['apple', 'age', 'patel', 'some', 'other']
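
The `for` clauses of a nested comprehension read in the same order as the equivalent nested loops, with the filter last. A sketch comparing the two forms on a shortened, illustrative version of `nested_words`:

```python
nested_words = [['apple', 'age', 'big'], ['some', 'other', 'words']]

# Comprehension form: outer loop first, inner loop second, condition last.
with_e = [word for group in nested_words for word in group if 'e' in word]

# Equivalent explicit loops.
with_e_loop = []
for group in nested_words:
    for word in group:
        if 'e' in word:
            with_e_loop.append(word)

print(with_e)                 # ['apple', 'age', 'some', 'other']
print(with_e == with_e_loop)  # True
```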

In [2]:
def my_function(x, y, z=5):
    if (x >  y):
        return x - y + z
    else:
        return y - x + z
my_function(5, 10)
    

10

In [3]:
my_function(z=100, x= 10, y=15)

105
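
A short sketch of the argument rules from the outline, reusing `my_function` from the cell above. The helper `no_return` is hypothetical and only illustrates the implicit `None`.

```python
def my_function(x, y, z=5):
    if x > y:
        return x - y + z
    return y - x + z

# z falls back to its default when omitted.
print(my_function(5, 10))              # 10

# The default can be overridden positionally or by keyword.
print(my_function(5, 10, 1))           # 6
print(my_function(y=15, x=10, z=100))  # 105

# A function that reaches its end without a return statement returns None.
def no_return():
    pass

print(no_return())                     # None
```

Keyword arguments can be given in any order, but they must come after any positional arguments in the call.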

In [5]:
import re
all_contries = ['canada','#usa','india?','west indies','ausTralia', 'JAPAN']

def clean_country_names(countries):
    result = []
    for country in countries:
        country = country.strip()                  # drop surrounding whitespace
        country = re.sub(r'[!$?#]', '', country)   # remove stray punctuation
        country = country.title()                  # normalize capitalization
        result.append(country)
    return result

clean_country_names(all_contries)

['Canada', 'Usa', 'India', 'West Indies', 'Australia', 'Japan']

In [8]:
def remove_punctuation(value):
    return re.sub('[!#$?]','',value)

clean_ops = [str.strip, remove_punctuation,str.title]

def clean_strings(strings,ops):
    result = []
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

clean_strings(all_contries,clean_ops)

['Canada', 'Usa', 'India', 'West Indies', 'Australia', 'Japan']

In [12]:
for x in map(str.title,map(remove_punctuation, all_contries)):
    print(x)

Canada
Usa
India
West Indies
Australia
Japan
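
Chained `map` calls can also be written as a list comprehension, which is often easier to read. A sketch comparing the two on a shortened, illustrative list named `raw`:

```python
import re

def remove_punctuation(value):
    return re.sub(r'[!#$?]', '', value)

raw = ['canada', '#usa', 'india?']

# map is lazy: nothing is computed until the result is consumed, here by list().
via_map = list(map(str.title, map(remove_punctuation, raw)))

# The same pipeline as a list comprehension.
via_comprehension = [remove_punctuation(value).title() for value in raw]

print(via_map)                       # ['Canada', 'Usa', 'India']
print(via_map == via_comprehension)  # True
```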
