---
**Author**: Gunnvant

**Description**: Lecture on Python 101

**Audience**: Beginners with little or no programming background

**Pre-requisites**: Basic programming concepts such as variables, loops, and if-else statements

---

**TOC**:
1. Basic Python Data Structures
2. Numbers and Strings
3. Lists and Tuples
4. Sets and Dictionaries
5. File objects
6. Looping constructs: maps, list comprehension

## Numbers

In [1]:
a = 3
b = 3.0
print(a)
print(b)

3
3.0


In [2]:
print(type(a))

<class 'int'>


In [3]:
print(type(b))

<class 'float'>


In [4]:
## What will happen if we divide the two
print(a/b)

1.0


In [5]:
print(a*b)

9.0


In [6]:
print(a-b)

0.0


In [7]:
print(a+b)

6.0
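All four results above are floats because Python converts the `int` to a `float` whenever the two types are mixed in arithmetic. A small sketch of that rule, together with the two division operators (the names `x`, `y`, `true_div`, `floor_div` and `mixed` are made up for this illustration):

```python
x = 7      # int
y = 2      # int

true_div = x / y     # "/" always returns a float, even for two ints
floor_div = x // y   # "//" rounds down to the nearest whole number
mixed = x + 2.0      # mixing int and float gives a float

print(true_div, type(true_div))
print(floor_div, type(floor_div))
print(mixed, type(mixed))
```

The `//` operator comes up again later when we look for the middle index of a string.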


## Strings

In [8]:
s1 = "This is a string"
s2 = "This is also a string"
s3 = '''This too is a string''' ## you can write multiline strings using ''' triple quotes '''

In [9]:
### Strings are iterables and can be indexed and looped through
s1[0]

'T'

In [10]:
s1[2]

'i'

In [11]:
s1[-1]

'g'

In [12]:
for s in s1:
    print(s)

T
h
i
s
 
i
s
 
a
 
s
t
r
i
n
g


## Some peculiarities of strings

In [13]:
a = 4 
print(a) ## you can reassign a variable to a new number

4


In [14]:
print(s1)
s1 = 'This is not a string'
print(s1) ## you can overwrite the whole string

This is a string
This is not a string


In [15]:
s1[0]="h" #but you can't change the strings in place

TypeError: 'str' object does not support item assignment

In [16]:
print(s1)
print(s2)

This is not a string
This is also a string


In [17]:
print(s1+s2) # you can concatenate two strings 

This is not a stringThis is also a string


In [18]:
print(s1+" "+s2)

This is not a string This is also a string


In [19]:
print(f'{s1} {s2}') #you can also use f-strings

This is not a string This is also a string


In [20]:
### Quotes behaviour
s1 = '"This has a quote"'
s2 = 'This has a quote'
s1 == s2

False

In [21]:
s1.replace('"',"")

'This has a quote'

In [22]:
s1.replace('"',"") == s2

True

## How do we know what operations can be done on strings?

Short Answer: Everything in Python is an object, that is, an instance of a [class](https://en.wikipedia.org/wiki/Class_(computer_programming)). One can easily find out which functions (methods) the objects of a class support. We will discuss Python classes in more detail in upcoming sessions.

In [23]:
print(dir(s1)) 
### dir() lists all the methods and attributes available on an object. Ignore the names of the form __name__ for now

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']


In [24]:
### How to use the methods, once we know what methods are supported by a particular class?
s1.lower()

'"this has a quote"'

In [25]:
s1

'"This has a quote"'

In [26]:
### Try using title method


In [27]:
### How do we find out what a method does?
## Google
## Use python documentation

?s1.title

Signature: s1.title()
Docstring:
Return a version of the string where each word is titlecased.

More specifically, words start with uppercased characters and all remaining
cased characters have lower case.
Type:      builtin_function_or_method
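The `?s1.title` syntax only works inside IPython and Jupyter. In a plain Python interpreter the same documentation can be reached through the built-in `help()` function or the `__doc__` attribute. A short sketch (the variable names `s` and `doc` are made up here):

```python
s = "this has a quote"

# In an interactive session, help(str.title) prints the documentation.
# The same text is stored on the method itself as the __doc__ attribute:
doc = str.title.__doc__
print(doc)

# Trying the method out is often the quickest way to understand it
print(s.title())
```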


In [28]:
## What does the method split do?


### Class Demonstration [Common Programming Interview problems](https://www.geeksforgeeks.org/python-uppercase-half-string/)
- Input : test_str = dino
- Output : diNO
- Explanation : Latter half of string is uppercased.

- Input : test_str = apples
- Output : appLES
- Explanation : Latter half of string is uppercased.

In [29]:
s = 'dino'
## Which letters have to be capitalized?
## Can you tell the position?
s[0]

'd'

In [30]:
s[1]

'i'

In [31]:
s[2]

'n'

In [32]:
s[3]

'o'

In [33]:
## Can you come up with a logic to find the middle position of any string?
l = len(s)

In [34]:
l/2

2.0

In [35]:
## Can you think of any counter example where this may fail?
s = 'dinos'
l = len(s)

In [36]:
l/2

2.5

In [37]:
l//2

2

In [38]:
### Create logic to find the middle index
s = 'dinos'
mid_idx = len(s)//2
new_string = ''
for i in range(len(s)):
    if i<mid_idx:
        new_string = new_string+s[i]
    else:
        new_string = new_string+s[i].upper()

In [39]:
print(new_string)

diNOS


In [40]:
### We can put the complete logic in a function
def transform(s):
    mid_idx = len(s)//2
    new_string = ''
    for i in range(len(s)):
        if i<mid_idx:
            new_string = new_string+s[i]
        else:
            new_string = new_string+s[i].upper()
    return new_string

In [41]:
transform("apples")

'appLES'
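The loop in `transform` can also be written with slicing, which takes a whole range of characters in one go. This is only an alternative sketch; the function name `transform_sliced` is made up here and is not part of the original exercise:

```python
def transform_sliced(s):
    # s[:mid_idx] is the first half, s[mid_idx:] is the rest
    mid_idx = len(s) // 2
    return s[:mid_idx] + s[mid_idx:].upper()

print(transform_sliced("dino"))
print(transform_sliced("dinos"))
print(transform_sliced("apples"))
```

Both versions treat the middle character of an odd-length string as part of the latter half, because `//` rounds the middle index down.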

### Class Exercise [source](https://www.geeksforgeeks.org/python-program-to-check-if-a-string-has-at-least-one-letter-and-one-number/) (Try to solve on your own)

- Input: welcome2ourcountry34
- Output: True

- Input: stringwithoutnum
- Output: False

- Hint: Look at the output of dir() for any string object and find the methods that can tell you whether a character is a letter or a digit


## Lists
- Lists are general buckets
- They can be used to store many types of data including lists themselves

In [42]:
l1 = [1,2,3,4,'a','b',[99,24,11]]

In [43]:
l1[0]

1

In [44]:
for i in l1:
    print(i)

1
2
3
4
a
b
[99, 24, 11]


In [45]:
l1.append(169)

In [46]:
l1

[1, 2, 3, 4, 'a', 'b', [99, 24, 11], 169]

In [47]:
l2 = [80,90,100]

In [48]:
l1+l2

[1, 2, 3, 4, 'a', 'b', [99, 24, 11], 169, 80, 90, 100]

In [49]:
l1.extend(l2)

In [50]:
l1

[1, 2, 3, 4, 'a', 'b', [99, 24, 11], 169, 80, 90, 100]

In [51]:
print(dir(l1))

['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']


In [52]:
len(l1)

11
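`append`, `extend` and `+` look similar but behave differently, which is a common source of confusion. A small sketch of the difference (the list names `base`, `extra`, `appended`, `extended` and `joined` are made up for this example):

```python
base = [1, 2, 3]
extra = [80, 90]

appended = base.copy()
appended.append(extra)     # adds the whole list as ONE element

extended = base.copy()
extended.extend(extra)     # adds each element of `extra` separately

joined = base + extra      # builds a new list, `base` is unchanged

print(appended)
print(extended)
print(joined)
print(base)
```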

### Using lists and strings to analyse data -  Class Case Study 1

In [1]:
amazon_review = '''
What can I say!
SIMPLY SPECTACULAR.
PUBG - smooth +90 FPS
Smashing all the bench marks by some margin (140%~~)
4 SPEAKERS are like home theatre.
Battery backup is excellent
Could have been better than 20 wts charger

I will keep u guys posted...stay tuned

'''

### Q1. Count the number of sentences in the review

Understand what new line, line feed and carriage return are: **[reading](https://www.loginradius.com/blog/async/eol-end-of-line-or-newline-characters/)**

In [54]:
amazon_review.split("\n") ## How can you store only sentences? 

['',
 'What can I say!',
 'SIMPLY SPECTACULAR.',
 'PUBG - smooth +90 FPS',
 'Smashing all the bench marks by some margin (140%~~)',
 '4 SPEAKERS are like home theatre.',
 'Battery backup is excellent',
 'Could have been better than 20 wts charger',
 '',
 'I will keep u guys posted...stay tuned',
 '',
 '']

In [55]:
## We can fill an empty list by iterating over the comment and checking for empty comments
valid_comments = []
for comment in amazon_review.split("\n"):
    if comment!="":
        valid_comments.append(comment)

In [56]:
print(valid_comments)

['What can I say!', 'SIMPLY SPECTACULAR.', 'PUBG - smooth +90 FPS', 'Smashing all the bench marks by some margin (140%~~)', '4 SPEAKERS are like home theatre.', 'Battery backup is excellent', 'Could have been better than 20 wts charger', 'I will keep u guys posted...stay tuned']


In [57]:
len(valid_comments)

8

### Q2. Find out approximately how many words there are in each sentence (count "say!" as one word)

In [58]:
word_counts = []
for sent in valid_comments:
    counts = len(sent.split(" "))
    word_counts.append(counts)
word_counts

[4, 2, 5, 9, 6, 4, 8, 7]
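Splitting on a single space works for the review above, but it over-counts when a sentence contains repeated or trailing spaces. Calling `split()` with no argument splits on any run of whitespace instead. A small sketch with a made-up sentence (the names `sentence`, `by_space` and `by_whitespace` are illustrative):

```python
sentence = "Battery  backup is   excellent "   # note the extra spaces

by_space = sentence.split(" ")     # keeps empty strings between repeated spaces
by_whitespace = sentence.split()   # no argument: splits on any run of whitespace

print(by_space)
print(len(by_space))
print(by_whitespace)
print(len(by_whitespace))
```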

## Class Exercise 2

In [59]:
dinkar = '''
वर्षों तक वन में घूम-घूम,
बाधा-विघ्नों को चूम-चूम,
सह धूप-घाम, पानी-पत्थर,
पांडव आये कुछ और निखर।
सौभाग्य न सब दिन सोता है,
देखें, आगे क्या होता है।

मैत्री की राह बताने को,
सबको सुमार्ग पर लाने को,
दुर्योधन को समझाने को,
भीषण विध्वंस बचाने को,
भगवान् हस्तिनापुर आये,
पांडव का संदेशा लाये।

‘दो न्याय अगर तो आधा दो,
पर, इसमें भी यदि बाधा हो,
तो दे दो केवल पाँच ग्राम,
रक्खो अपनी धरती तमाम।
हम वहीं खुशी से खायेंगे,
परिजन पर असि न उठायेंगे!

दुर्योधन वह भी दे ना सका,
आशीष समाज की ले न सका,
उलटे, हरि को बाँधने चला,
जो था असाध्य, साधने चला।
जब नाश मनुज पर छाता है,
पहले विवेक मर जाता है।

हरि ने भीषण हुंकार किया,
अपना स्वरूप-विस्तार किया,
डगमग-डगमग दिग्गज डोले,
भगवान् कुपित होकर बोले-
‘जंजीर बढ़ा कर साध मुझे,
हाँ, हाँ दुर्योधन! बाँध मुझे।

यह देख, गगन मुझमें लय है,
यह देख, पवन मुझमें लय है,
मुझमें विलीन झंकार सकल,
मुझमें लय है संसार सकल।
अमरत्व फूलता है मुझमें,
संहार झूलता है मुझमें।
'''

### Q1. Programmatically find out how many paragraphs there are in the poem.

### Q2. Count the number of words in each paragraph and in each sentence.

## Class Exercise 3: Gender Neutral Job Descriptions

In [60]:
job_description1 = '''We are a dominant engineering firm that boasts many leading clients. We are determined to stand apart from the competition.'''
job_description2 = '''We are a community of engineers who have effective relationships with many satisfied clients. We are committed to understanding the engineer sector intimately. '''
job_description3 = '''Strong communication and influencing skills. Ability to perform individually in a competitive environment. Superior ability to satisfy customers and manage company’s association with them.'''
job_description4 = '''Proficient oral and written communications skills. Collaborates well in a team environment. Sensitive to clients’ needs, can develop warm client relationships.'''
job_description5 = '''Direct project groups to manage project progress and ensure accurate task control. Determine compliance with client’s objectives.'''
job_description6 = '''Provide general support to project team in a manner complimentary to the company. Help clients with construction activities.'''

### Q1. Write the following functions:

1. get_sentences(): This takes job descriptions as an input and returns list of sentences as output
2. get_words(): This takes a sentence as an input and returns a list of words in the sentence as an output

In [61]:
feminine_coded_words = [
    "agree",
    "affectionate",
    "child",
    "cheer",
    "collab",
    "commit",
    "communal",
    "compassion",
    "connect",
    "considerate",
    "cooperat",
    "co-operat",
    "depend",
    "emotiona",
    "empath",
    "feel",
    "flatterable",
    "gentle",
    "honest",
    "interpersonal",
    "interdependen",
    "interpersona",
    "inter-personal",
    "inter-dependen",
    "inter-persona",
    "kind",
    "kinship",
    "loyal",
    "modesty",
    "nag",
    "nurtur",
    "pleasant",
    "polite",
    "quiet",
    "respon",
    "sensitiv",
    "submissive",
    "support",
    "sympath",
    "tender",
    "together",
    "trust",
    "understand",
    "warm",
    "whin",
    "enthusias",
    "inclusive",
    "yield",
    "share",
    "sharin"
]

masculine_coded_words = [
    "active",
    "adventurous",
    "aggress",
    "ambitio",
    "analy",
    "assert",
    "athlet",
    "autonom",
    "battle",
    "boast",
    "challeng",
    "champion",
    "compet",
    "confident",
    "courag",
    "decid",
    "decision",
    "decisive",
    "defend",
    "determin",
    "domina",
    "dominant",
    "driven",
    "fearless",
    "fight",
    "force",
    "greedy",
    "head-strong",
    "headstrong",
    "hierarch",
    "hostil",
    "impulsive",
    "independen",
    "individual",
    "intellect",
    "lead",
    "logic",
    "objective",
    "opinion",
    "outspoken",
    "persist",
    "principle",
    "reckless",
    "self-confiden",
    "self-relian",
    "self-sufficien",
    "selfconfiden",
    "selfrelian",
    "selfsufficien",
    "stubborn",
    "superior",
    "unreasonab"
]


### Q2. Now write a function that takes the list of words in a sentence and counts how many of them start with one of the root forms in either the masculine or the feminine list.

```python
def get_coded_count(list_words,coded_word_list):
    '''
    
    parameters:
    __________
    list_words: list of words in a sentence in JD
    coded_word_list: either masculine_coded_words or feminine_coded_words
    
    returns:[counts, words]
    counts: count of words that are in the sentence of the jd as well as the coded list
    words: words that are in the sentence of the jd as well as the coded list
    
    '''
    ### Your code goes here
    
    
    return [counts,words]

```

## Tuples
- Work for the most part like lists
- Their values can't be changed in-place.

In [62]:
t = (1,2,3,4,(56,78,'a'),[7,8,9])
l1 = [1,2,3,4,(56,78,'a'),[7,8,9]]

In [63]:
for i in t:
    print(i)

1
2
3
4
(56, 78, 'a')
[7, 8, 9]


In [64]:
l1[0]

1

In [65]:
l1[0]=100

In [66]:
l1

[100, 2, 3, 4, (56, 78, 'a'), [7, 8, 9]]

In [67]:
t[0]

1

In [68]:
t[0] = 100

TypeError: 'tuple' object does not support item assignment
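One subtlety worth knowing: a tuple cannot have its items reassigned, but a mutable object stored inside it (such as the list at the end of `t`) can still be modified. A minimal sketch using the same tuple as above:

```python
t = (1, 2, 3, 4, (56, 78, 'a'), [7, 8, 9])

# The tuple cannot be given a new item at any position...
try:
    t[0] = 100
except TypeError as err:
    print("Error:", err)

# ...but the list stored inside it is still an ordinary, mutable list
t[5].append(10)
print(t)
```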

## Sets

- They behave like mathematical sets
- Come in handy while doing set operations such as intersection (to find common items) etc

In [69]:
set_a = set(['a','b','c'])
set_b = {'b','c','d'}

In [70]:
print(dir(set_a))

['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__iand__', '__init__', '__init_subclass__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']


In [71]:
for i in set_a:
    print(i)

b
a
c


In [72]:
set_a[0]

TypeError: 'set' object is not subscriptable

In [73]:
set_a.intersection(set_b)

{'b', 'c'}

In [74]:
set_a.difference(set_b)

{'a'}

In [75]:
set_a == set_b

False

In [76]:
set_c = {'a','b','c'}

In [77]:
set_a == set_c

True

In [78]:
set("india") ## finds the unique characters in a string

{'a', 'd', 'i', 'n'}
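The names `__and__`, `__or__`, `__sub__` and `__xor__` in the `dir()` output above are what make the operators `&`, `|`, `-` and `^` work on sets. A short sketch showing each operator next to the method it corresponds to (the result names are made up for this example):

```python
set_a = {'a', 'b', 'c'}
set_b = {'b', 'c', 'd'}

common = set_a & set_b        # same as set_a.intersection(set_b)
only_in_a = set_a - set_b     # same as set_a.difference(set_b)
either = set_a | set_b        # same as set_a.union(set_b)
not_shared = set_a ^ set_b    # same as set_a.symmetric_difference(set_b)

print(common, only_in_a, either, not_shared)
```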

## Class Exercise (Traversing tuples and using set operations)

In [1]:
import pickle ## ignore this for now
with open("../../data/ner_dataset_list","rb") as f:
    ner_data = pickle.load(f)

In [80]:
ner_data[0:2]

[[('Pierre', 'PERSON'),
  ('Vinken', 'ORGANIZATION'),
  (',', ''),
  ('61', ''),
  ('years', ''),
  ('old', ''),
  (',', ''),
  ('will', ''),
  ('join', ''),
  ('the', ''),
  ('board', ''),
  ('as', ''),
  ('a', ''),
  ('nonexecutive', ''),
  ('director', ''),
  ('Nov.', ''),
  ('29', ''),
  ('.', '')],
 [('Mr.', 'PERSON'),
  ('Vinken', 'PERSON'),
  ('is', ''),
  ('chairman', ''),
  ('of', ''),
  ('Elsevier', 'ORGANIZATION'),
  ('N.V.', ''),
  (',', ''),
  ('the', ''),
  ('Dutch', 'GPE'),
  ('publishing', ''),
  ('group', ''),
  ('.', '')]]

### Q1. Use your knowledge about lists and tuples to extract all the entries that have been labelled as GPE

In [81]:
gpes = []
for doc in ner_data:
    for ent in doc:
        if ent[1]=="GPE":
            gpes.append(ent[0].lower())            

In [82]:
gpes[0:20]

['dutch',
 'agnew',
 'british',
 'new',
 'medicine',
 'boston',
 'western',
 'u.s.',
 'u.s.',
 'medicine',
 'july',
 'average',
 'toronto',
 'finmeccanica',
 'italian',
 'wickliffe',
 'ohio',
 'u.s.',
 'legislation',
 'u.s.']

### Q2. Find the number of unique GPEs in the data. Use the idea of sets. Also find out the number of duplicated entries.

In [83]:
len(set(gpes))

556

In [84]:
len(gpes)-len(set(gpes))

1339

## Dictionaries

- Key-value pairs
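- Values are looked up by key, not by position (since Python 3.7 a dictionary does remember the order in which keys were inserted)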

In [85]:
d = {'name':'a','age':29,'prev_companies':['abc','def']}

In [86]:
d[0]

KeyError: 0

In [87]:
d['name']

'a'

In [88]:
d['prev_companies']

['abc', 'def']

In [89]:
d['prev_companies'][-1]

'def'

In [90]:
for i in d:
    print(i)

name
age
prev_companies


In [91]:
for i in d:
    print(d[i])

a
29
['abc', 'def']


In [92]:
d['new_key'] = 'value'

In [93]:
d

{'name': 'a', 'age': 29, 'prev_companies': ['abc', 'def'], 'new_key': 'value'}

In [94]:
'value' in d

False

In [95]:
'new_key' in d

True
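As the last two cells show, `in` checks only the keys of a dictionary. A short sketch of three related tools that are handy here: `.values()`, `.get()` and `.items()`. The key `'salary'` and the names `missing` and `pairs` are made up for this example:

```python
d = {'name': 'a', 'age': 29, 'new_key': 'value'}

print('value' in d)            # `in` looks at keys only
print('value' in d.values())   # search the values explicitly

# .get() returns a default instead of raising KeyError for a missing key
missing = d.get('salary', 'not available')
print(missing)

# .items() yields (key, value) pairs, handy in a for loop
pairs = []
for key, value in d.items():
    pairs.append((key, value))
print(pairs)
```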

## Class Exercise ([Word Frequency Counter](http://www.thehypertexts.com/Mirza%20Ghalib%20English%20Translations%20by%20Michael%20R.%20Burch.htm))

In [96]:
ghalib = [
'''
It’s only my heart, not unfeeling stone,
so why be dismayed when it throbs with pain?
It was made to suffer ten thousand darts;
why let one more torment impede us?
''',

'''
The miracle of your absence
is that I found myself endlessly searching for you.
''',

'''
On the subject of mystic philosophy, Ghalib,
your words might have struck us as deeply profound
and we might have pronounced you a saint ...
Yes, if only we hadn't found
you drunk
as a skunk!
''',
    
'''
Not the blossomings of songs nor the adornments of music:
I am the voice of my own heart breaking.
You toy with your long, dark curls
while I remain captive to my dark, pensive thoughts.
We congratulate ourselves that we two are different:
that this weakness has not burdened us both with inchoate grief.
Now you are here, and I find myself bowing—
as if sadness is a blessing, and longing a sacrament.
I am a fragment of sound rebounding;
you are the walls impounding my echoes.
''',
    
'''
All your life, O Ghalib,
You kept repeating the same mistake:
Your face was dirty
But you were obsessed with cleaning the mirror!
'''
]

In [97]:
counter = {}
for couplet in ghalib:
    for token in couplet.replace("\n"," ").split(" "):
        if token.lower() in counter:
            counter[token.lower()] = counter[token.lower()]+1
        else:
            counter[token.lower()]=1      

In [98]:
counter

{'': 10,
 'it’s': 1,
 'only': 2,
 'my': 4,
 'heart,': 1,
 'not': 3,
 'unfeeling': 1,
 'stone,': 1,
 'so': 1,
 'why': 2,
 'be': 1,
 'dismayed': 1,
 'when': 1,
 'it': 2,
 'throbs': 1,
 'with': 4,
 'pain?': 1,
 'was': 2,
 'made': 1,
 'to': 2,
 'suffer': 1,
 'ten': 1,
 'thousand': 1,
 'darts;': 1,
 'let': 1,
 'one': 1,
 'more': 1,
 'torment': 1,
 'impede': 1,
 'us?': 1,
 'the': 8,
 'miracle': 1,
 'of': 6,
 'your': 5,
 'absence': 1,
 'is': 2,
 'that': 3,
 'i': 5,
 'found': 2,
 'myself': 2,
 'endlessly': 1,
 'searching': 1,
 'for': 1,
 'you.': 1,
 'on': 1,
 'subject': 1,
 'mystic': 1,
 'philosophy,': 1,
 'ghalib,': 2,
 'words': 1,
 'might': 2,
 'have': 2,
 'struck': 1,
 'us': 2,
 'as': 3,
 'deeply': 1,
 'profound': 1,
 'and': 3,
 'we': 4,
 'pronounced': 1,
 'you': 7,
 'a': 5,
 'saint': 1,
 '...': 1,
 'yes,': 1,
 'if': 2,
 "hadn't": 1,
 'drunk': 1,
 'skunk!': 1,
 'blossomings': 1,
 'songs': 1,
 'nor': 1,
 'adornments': 1,
 'music:': 1,
 'am': 2,
 'voice': 1,
 'own': 1,
 'heart': 1,
 'breaking
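The counts above include an empty-string key and treat `'heart,'` and `'heart'` as different words. One possible refinement, sketched here on just the last verse, is to split on any whitespace and strip punctuation from each token. This is a sketch under assumptions, not the only way to tokenize: `string.punctuation` covers ASCII punctuation only, and the names `lines` and `word` are made up for this example.

```python
import string

lines = [
    "All your life, O Ghalib,",
    "You kept repeating the same mistake:",
    "Your face was dirty",
    "But you were obsessed with cleaning the mirror!",
]

counter = {}
for line in lines:
    for token in line.split():                           # split() drops empty tokens
        word = token.lower().strip(string.punctuation)   # 'ghalib,' becomes 'ghalib'
        if word:                                         # skip tokens that were only punctuation
            counter[word] = counter.get(word, 0) + 1

print(counter)
```

Here `counter.get(word, 0)` replaces the if/else from the original loop: it returns the current count, or 0 when the word has not been seen yet.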

## Class Exercise (Organize the text)

### Q1 Use the ner_data and organize the data in the following manner in a dictionary:

```json
{
    "PERSON":['Pierre'...],
    "ORGANIZATION":['Boeing'.....],
    "GPE":['DUTCH'.....],
    "LOCATION":['Missisipi'.....]
}

```
You can ignore named entities that span several contiguous words, e.g. for New York, you can treat New and York as separate words.

### Q2. Now modify the existing dictionary so that duplicate entries are removed

## [File objects](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files)

- They handle disk-based I/O.
- They let you read and write text as well as binary files.

In [2]:
f = open("../../data/file_data.csv","r")
data_string = f.read()
f.close()
f = open("../../data/file_data.csv","r")
data_lines = f.readlines()
f.close()

In [100]:
data_string[0:100]

',sensor_id,time,incoming,outgoing,range,date,hour,minute,total,location_name\n0,52,2021-06-17 07:07:1'

In [101]:
data_lines[0:2]

[',sensor_id,time,incoming,outgoing,range,date,hour,minute,total,location_name\n',
 '0,52,2021-06-17 07:07:11.937082+00:00,0,2,1min,2021-06-17,7,7,2,reitan_7eleven_carlberner\n']

In [102]:
headers = data_lines[0].replace("\n","").split(",")

In [103]:
headers

['',
 'sensor_id',
 'time',
 'incoming',
 'outgoing',
 'range',
 'date',
 'hour',
 'minute',
 'total',
 'location_name']

In [104]:
sensor_id = []
time = [] 
incoming = []
outgoing = []
rnge = []
date = [] 
hour = []
minute = [] 
total = [] 
location_name = []

for row in data_lines[1:]:
    values = row.replace("\n","").split(",")[1:]
    sensor_id.append(values[0])
    time.append(values[1])
    incoming.append(values[2])
    outgoing.append(values[3])
    rnge.append(values[4])
    date.append(values[5])
    hour.append(values[6])
    minute.append(values[7])
    total.append(values[8])
    location_name.append(values[9])

In [105]:
sensor_id[0:5]

['52', '52', '52', '52', '52']

In [106]:
total[0:5]

['2', '1', '1', '1', '1']

In [107]:
sensor_id = []
time = [] 
incoming = []
outgoing = []
rnge = []
date = [] 
hour = []
minute = [] 
total = [] 
location_name = []

for row in data_lines[1:]:
    values = row.replace("\n","").split(",")[1:]
    sensor_id.append(int(values[0]))
    time.append(values[1])
    incoming.append(int(values[2]))
    outgoing.append(int(values[3]))
    rnge.append(values[4])
    date.append(values[5])
    hour.append(int(values[6]))
    minute.append(int(values[7]))
    total.append(int(values[8]))
    location_name.append(values[9])

In [108]:
total[0:3]

[2, 1, 1]
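Splitting each line on commas by hand is good practice, but Python's standard library also has a `csv` module that does this parsing. A minimal sketch, assuming the same column layout as the file above; `io.StringIO` stands in for the real file so the example runs on its own, and the names `sample`, `rows` and `totals_from_csv` are made up here:

```python
import csv
import io

# Header and two rows copied from the output shown earlier
sample = io.StringIO(
    ",sensor_id,time,incoming,outgoing,range,date,hour,minute,total,location_name\n"
    "0,52,2021-06-17 07:07:11.937082+00:00,0,2,1min,2021-06-17,7,7,2,reitan_7eleven_carlberner\n"
    "1,52,2021-06-17 07:07:51.166361+00:00,1,0,1min,2021-06-17,7,7,1,reitan_7eleven_carlberner\n"
)

# DictReader uses the first line as keys and returns one dictionary per row
rows = list(csv.DictReader(sample))

totals_from_csv = []
for row in rows:
    totals_from_csv.append(int(row["total"]))   # values are still strings until converted

print(rows[0]["location_name"])
print(totals_from_csv)
```

With a real file, the `sample` object would be replaced by the file object returned from `open()`.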

### Q1 Find out the average of the column total from 11 PM to 11:45 PM on 2021-06-18.

In [109]:
cnt = 0
for values in zip(date,hour,minute,total):
    print(values)
    cnt+=1
    if cnt==10:
        break

('2021-06-17', 7, 7, 2)
('2021-06-17', 7, 7, 1)
('2021-06-17', 7, 9, 1)
('2021-06-17', 7, 10, 1)
('2021-06-17', 7, 11, 1)
('2021-06-17', 7, 14, 1)
('2021-06-17', 7, 16, 1)
('2021-06-17', 7, 17, 1)
('2021-06-17', 7, 17, 1)
('2021-06-17', 7, 19, 1)


In [110]:
cnt = 0
totals = []
for values in zip(date,hour,minute,total):
    if values[0]=="2021-06-18" and values[1]==23 and values[2]<=45:
        cnt+=1
        totals.append(values[3])

In [111]:
cnt

5

In [112]:
totals

[1, 1, 1, 1, 1]

In [113]:
sum(totals)/cnt

1.0

### Q2. Count the number of times the incoming is more than outgoing in the whole file.

In [114]:
cnt = 0
for values in zip(incoming,outgoing):
    if values[0]>values[1]:
        cnt+=1
print(cnt)

324


## Class Assignment 

- Use the data tweets_assignment.txt
- Find out how many tweets were tweeted on each day, i.e. how many tweets were made on 26th December, how many on 25th December and so on.

## Context Manager: open()

- We need to close the file connection every time we open a file
- Context managers do some of the tasks implicitly for us
- When `open()` is used with a context manager, the file is closed automatically as soon as the block ends
- We use the `with` statement to get a context manager while working with file objects

In [3]:
with open("../../data/file_data.csv","r") as f:
    data_list = f.readlines()
data_list[0:10]

[',sensor_id,time,incoming,outgoing,range,date,hour,minute,total,location_name\n',
 '0,52,2021-06-17 07:07:11.937082+00:00,0,2,1min,2021-06-17,7,7,2,reitan_7eleven_carlberner\n',
 '1,52,2021-06-17 07:07:51.166361+00:00,1,0,1min,2021-06-17,7,7,1,reitan_7eleven_carlberner\n',
 '2,52,2021-06-17 07:09:48.861997+00:00,0,1,1min,2021-06-17,7,9,1,reitan_7eleven_carlberner\n',
 '3,52,2021-06-17 07:10:40.197648+00:00,1,0,1min,2021-06-17,7,10,1,reitan_7eleven_carlberner\n',
 '4,52,2021-06-17 07:11:49.664105+00:00,0,1,1min,2021-06-17,7,11,1,reitan_7eleven_carlberner\n',
 '5,52,2021-06-17 07:14:20.731722+00:00,1,0,1min,2021-06-17,7,14,1,reitan_7eleven_carlberner\n',
 '6,52,2021-06-17 07:16:27.575491+00:00,0,1,1min,2021-06-17,7,16,1,reitan_7eleven_carlberner\n',
 '7,52,2021-06-17 07:17:21.919449+00:00,1,0,1min,2021-06-17,7,17,1,reitan_7eleven_carlberner\n',
 '8,52,2021-06-17 07:17:52.122774+00:00,0,1,1min,2021-06-17,7,17,1,reitan_7eleven_carlberner\n']
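To see that the file really is closed once the `with` block ends, we can check the `closed` attribute of the file object. A small sketch that writes and reads a throwaway file so it does not depend on `file_data.csv` (the file name `demo.txt` and the variable names are made up for this example):

```python
import os
import tempfile

# Create a small throwaway file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:
    f.write("first line\nsecond line\n")

print(f.closed)   # the file was closed when the with-block ended

with open(path, "r") as f:
    lines = f.readlines()

print(lines)
```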

## Alternate Looping constructs

In [116]:
job_description1 = '''We are a dominant engineering firm that boasts many leading clients. We are determined to stand apart from the competition.'''
job_description2 = '''We are a community of engineers who have effective relationships with many satisfied clients. We are committed to understanding the engineer sector intimately. '''
job_description3 = '''Strong communication and influencing skills. Ability to perform individually in a competitive environment. Superior ability to satisfy customers and manage company’s association with them.'''
job_description4 = '''Proficient oral and written communications skills. Collaborates well in a team environment. Sensitive to clients’ needs, can develop warm client relationships.'''
job_description5 = '''Direct project groups to manage project progress and ensure accurate task control. Determine compliance with client’s objectives.'''
job_description6 = '''Provide general support to project team in a manner complimentary to the company. Help clients with construction activities.'''

## List Comprehension
- Prefer it when replacing a simple for loop
- Inspired from set notation in mathematics
- You can decide if you want to use this construct based on your comfort level
- You might have to read other people's code where this construct may have been used

Let's implement the sentence tokenization logic using list comprehension

In [117]:
[sent for sent in job_description1.split(".")]

['We are a dominant engineering firm that boasts many leading clients',
 ' We are determined to stand apart from the competition',
 '']

In [118]:
[sent for sent in job_description1.split(".") if sent!="" and sent!=" "]
## produce a list of sent given the iterable and the condition

['We are a dominant engineering firm that boasts many leading clients',
 ' We are determined to stand apart from the competition']
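A list comprehension is a compact way of writing a for loop that fills a list. The sketch below shows both forms side by side on a shortened, made-up job description, and also uses `strip()` to remove the leading space seen in the output above (the names `jd`, `sentences_loop` and `sentences_comp` are illustrative):

```python
jd = "We are a dominant engineering firm. We are determined to stand apart."

# For-loop version
sentences_loop = []
for sent in jd.split("."):
    if sent.strip() != "":
        sentences_loop.append(sent.strip())

# Equivalent list comprehension: [what to keep, for each item, if condition]
sentences_comp = [sent.strip() for sent in jd.split(".") if sent.strip() != ""]

print(sentences_loop)
print(sentences_comp)
```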

## Maps

- These can be used to refactor a for loop in which a lot is going on
- The idea is best described by the following GIF

![](maps.gif)


Let's implement the logic we discussed above using the notion of map

In [119]:
def square(x):
    return x**2
List = [1,2,3,4,5,6,7,8,9]

In [120]:
list(map(square,List))

[1, 4, 9, 16, 25, 36, 49, 64, 81]
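`map` returns a lazy map object, which is why the call above is wrapped in `list()`. The same result can be written with a `lambda` (a small unnamed function) or with a list comprehension. A short sketch comparing the three (the names `numbers`, `mapped`, `as_list`, `with_lambda` and `with_comprehension` are made up here):

```python
def square(x):
    return x ** 2

numbers = [1, 2, 3, 4, 5]

mapped = map(square, numbers)   # a lazy map object, nothing is computed yet
as_list = list(mapped)          # list() pulls the values out

with_lambda = list(map(lambda x: x ** 2, numbers))
with_comprehension = [x ** 2 for x in numbers]

print(as_list)
print(with_lambda)
print(with_comprehension)
```

Which form to use is largely a matter of readability; for a simple transformation like this one, many people find the comprehension easiest to read.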