That first line of code (%matplotlib inline) isn’t actually a Python command, but uses something called a line magic to instruct Jupyter to capture Matplotlib plots and render them in the cell output. We'll talk a bit more about line magics later, and they're also covered in our advanced [Jupyter Notebooks tutorial](https://www.dataquest.io/blog/advanced-jupyter-notebooks-tutorial/).

In [2]:
from collections import defaultdict
from collections import Counter

In [3]:
print("hello world")

hello world


**Function parameters can also be given default arguments, which only need to be specified when you want a value other than the default.** 

In [5]:
def my_print(message="my default message"):
    print(message)

my_print("hello")   # prints 'hello'
my_print()          # prints 'my default message'

hello
my default message
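
Default arguments pair naturally with passing arguments by name. The following is a minimal sketch; the `full_name` function and its default values are hypothetical names chosen only for illustration:

```python
def full_name(first="What's-his-name", last="Something"):
    """Join a first and last name, falling back to the defaults."""
    return first + " " + last

print(full_name("Himan", "Pradhaan"))  # both values given positionally
print(full_name("Himan"))              # last falls back to its default
print(full_name(last="Pradhaan"))      # first falls back to its default
```

Naming the argument lets you override a later default without having to supply the earlier ones.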


*You can create multiline strings using three double quotes:*


In [7]:
multi_line_string="""This is the first line 
this is the second line
and this is the final and the last and the only one of a kind third line"""

In [8]:
print(multi_line_string)

This is the first line 
this is the second line
and this is the final and the last and the only one of a kind third line


In [9]:
first_name="Himan"
last_name="Pradhaan"
full_name=first_name+" "+last_name
print(full_name)

Himan Pradhaan


The example above uses classic string concatenation with `+`. The f-string version below is much less unwieldy:

In [11]:
full_name2=f"My name is, {first_name} {last_name}"
print(full_name2)

My name is, Himan Pradhaan
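
F-strings can also hold expressions and format specifiers inside the braces. A short sketch follows; `price`, `quantity`, and the other new variable names are made up for illustration:

```python
first_name = "Himan"
last_name = "Pradhaan"

# str.format is a third option, sitting between concatenation and f-strings
full_name3 = "{0} {1}".format(first_name, last_name)

# Any expression can go inside the braces of an f-string
shout = f"{first_name.upper()} {last_name.upper()}"

# A format specifier after the colon controls how numbers are rendered
price = 3.14159
quantity = 3
total_line = f"{quantity} items cost {price * quantity:.2f}"

print(full_name3)
print(shout)
print(total_line)
```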


In [12]:
try:
    print(0/0)
except ZeroDivisionError:
    print("Cannot divide by zero")

Cannot divide by zero


In [13]:
x=[0,1,2,3,4,5,6,7,8,9]
print(x[:3])
print(x[3:])
print(x[-3:])
print(x[:-3])

[0, 1, 2]
[3, 4, 5, 6, 7, 8, 9]
[7, 8, 9]
[0, 1, 2, 3, 4, 5, 6]


In [14]:
x, y=[1,2]
print("x is:",x,"and y is:",y)

x is: 1 and y is: 2


In [15]:
document=["orange","apple","apple", 
          "banana", "apple","banana","apple", 
          "orange", "banana", "apple","banana"]
word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word]+=1
    else:
        word_counts[word]=1

print(word_counts)

{'orange': 2, 'apple': 5, 'banana': 4}


In [16]:
word_counts={}
for word in document:
    try:
        word_counts[word]+=1
    except KeyError:
        word_counts[word]=1


print(word_counts)

{'orange': 2, 'apple': 5, 'banana': 4}


In [17]:
word_counts={}
for word in document:
    previous_count=word_counts.get(word,0)
    word_counts[word]=previous_count+1
print(word_counts)

{'orange': 2, 'apple': 5, 'banana': 4}


When you print a `defaultdict`, the output includes the default factory function as well as the contents.

- Standard dictionary (`dict`): printing it shows only the key-value pairs.
- `defaultdict`: printing it shows the default factory function followed by the key-value pairs. This is because `defaultdict` is a subclass of `dict` whose representation also reports the default value mechanism.
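
The difference is easy to see side by side. This is a minimal sketch; the `fruit_counts` name and the sample words are illustrative and separate from the notebook's own cells:

```python
from collections import defaultdict

fruit_counts = defaultdict(int)  # int() returns 0, the default for missing keys
for word in ["apple", "banana", "apple"]:
    fruit_counts[word] += 1

print(fruit_counts)                    # representation includes the factory
print(dict(fruit_counts))              # a plain dict shows only the pairs
print(isinstance(fruit_counts, dict))  # defaultdict is a dict subclass
```

Converting with `dict(...)` is one way to get the plain key-value output when the factory is just noise.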

In [19]:
word_counts=defaultdict(int)
for word in document:
    word_counts[word]+=1
print(word_counts)

defaultdict(<class 'int'>, {'orange': 2, 'apple': 5, 'banana': 4})


In [20]:
dd_list = defaultdict(list)
dd_list[2].append(1)
print(dd_list)

defaultdict(<class 'list'>, {2: [1]})


In [21]:
c = Counter([0, 1, 2, 0])
word_counts=Counter(document)
print(word_counts)

Counter({'apple': 5, 'banana': 4, 'orange': 2})


Printing the 2 most common words and their counts

In [23]:
for word, count in word_counts.most_common(2):
    print(word, count)

apple 5
banana 4


In [24]:
pairs = word_counts.most_common(3)

# Custom function that wraps the for loop and tuple unpacking
def custom_for_loop(pairs):
    for word, count in pairs:
        print(word, count)

# Call the custom function
custom_for_loop(pairs)

apple 5
banana 4
orange 2


In [25]:
s={}
print(type(s))
print(s)
s=set()
print(type(s))
s.add(1)
print(s)

<class 'dict'>
{}
<class 'set'>
{1}


Sets are useful for two main reasons (for our purposes, at least).
The first is that `in` is a very fast operation on sets. If we have a large collection of items that we want to use for a membership test, a set is more appropriate than a list:

In [None]:
hundreds_of_other_words = []   # placeholder; the real list of stopwords is not shown here
stopwords_list = ["a", "an", "at"] + hundreds_of_other_words + ["yet", "you"]
"zip" in stopwords_list     # False, but has to check every element
stopwords_set = set(stopwords_list)
"zip" in stopwords_set      # False, and very fast to check

## Lists vs Sets
- Sets: Sets in Python are implemented using hash tables. This allows for average-case constant time complexity, O(1), for membership tests. When you check if an item is in a set, Python computes the hash of the item and directly checks the corresponding position in the hash table.
- Lists: Lists in Python are implemented as dynamic arrays. This means that to check if an item is in a list, Python has to iterate through each element of the list until it finds a match. This results in an average-case time complexity of O(n) for membership tests, where n is the number of elements in the list.

#### Let’s compare the performance of membership tests in sets and lists:

In [None]:
import time

# Create a large list and set (10 million items; larger sizes can exhaust memory)
large_list = list(range(10_000_000))
large_set = set(range(10_000_000))

# Test membership in list
start_time = time.perf_counter()
9_999_999 in large_list
end_time = time.perf_counter()
print("List membership test took:", end_time - start_time, "seconds")

# Test membership in set
start_time = time.perf_counter()
9_999_999 in large_set
end_time = time.perf_counter()
print("Set membership test took:", end_time - start_time, "seconds")
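
A single timing can be noisy, so one option is to repeat the measurement with the standard library's `timeit` module. This is a rough sketch with a much smaller collection; the size `n` and the repeat count of 100 are arbitrary choices:

```python
import timeit

n = 100_000                  # small enough to build quickly
large_list = list(range(n))
large_set = set(large_list)
target = n - 1               # worst case for the list: the last element

# Run each membership test 100 times and record the total elapsed time
list_seconds = timeit.timeit(lambda: target in large_list, number=100)
set_seconds = timeit.timeit(lambda: target in large_set, number=100)

print(f"List: {list_seconds:.6f} s for 100 lookups")
print(f"Set:  {set_seconds:.6f} s for 100 lookups")
```

The exact numbers depend on the machine, but the list lookup should come out far slower because it scans every element.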

#### Sets can only contain distinct elements

In [None]:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list)                # 6
item_set = set(item_list)                 # {1, 2, 3}
num_distinct_items = len(item_set)        # 3
distinct_item_list = list(item_set)       # [1, 2, 3] (set order is not guaranteed in general)
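
Because a set does not promise any particular order, converting back to a list may shuffle the items. If the original order matters, one alternative (not used elsewhere in this notebook) is `dict.fromkeys`, since dictionary keys are unique and keep insertion order in Python 3.7 and later:

```python
item_list = [3, 1, 2, 3, 1, 2]

# dict keys are unique and keep insertion order (Python 3.7+)
distinct_in_order = list(dict.fromkeys(item_list))

print(distinct_in_order)  # [3, 1, 2]
```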

## Hash Tables

In [None]:
class HashTable:
    def __init__(self, size):
        self.size = size
        self.table = [[] for _ in range(size)]

    def hash_function(self, key):
        a = hash(key)  # a large integer, e.g. 3159519891127782799
        return a % self.size

    def insert(self, key, value):
        index = self.hash_function(key)
        for kvp in self.table[index]:
            if kvp[0] == key:
                kvp[1] = value
                return
        self.table[index].append([key, value])

    def search(self, key):
        index = self.hash_function(key)
        for kvp in self.table[index]:
            if kvp[0] == key:
                return kvp[1]
        return None

    def delete(self, key):
        index = self.hash_function(key)
        for i, kvp in enumerate(self.table[index]):
            if kvp[0] == key:
                del self.table[index][i]
                return

# Example usage
hash_table = HashTable(10)
hash_table.insert("apple", 1)
print(hash_table.table)  # show the buckets; print(hash_table) would only show the default object repr
hash_table.insert("banana", 2)
print(hash_table.search("apple"))  # Output: 1
hash_table.delete("apple")
print(hash_table.search("apple"))  # Output: None
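
The inner lists exist to handle collisions, where two different keys map to the same index. The sketch below illustrates this with integer keys, assuming CPython, where a small integer hashes to itself, so the bucket positions are predictable; the `buckets` name and the sample keys are illustrative only:

```python
size = 10
buckets = [[] for _ in range(size)]  # one list (bucket) per slot

# 1 and 11 both land at index 1 because 1 % 10 == 11 % 10
for key, value in [(1, "one"), (11, "eleven"), (2, "two")]:
    index = hash(key) % size
    buckets[index].append([key, value])

print(buckets[1])  # two entries share this bucket
print(buckets[2])  # a single entry
```

A lookup for key `11` still works because the search walks the bucket and compares keys one by one, which is why the operation is O(1) only on average.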

In [36]:
x = 5
assert x > 0, "x should be positive"
print("x is positive")

x is positive


### Object Oriented Programming

In [64]:
class CountingClicker: 
    """A class can/should have a docstring, just like a function"""
    def __init__(self, count=0):
        self.count=count

    def __repr__(self): 
        return f"CountingClicker(count={self.count})"

    
    # click, read, and reset make up the public API of the class
    def click(self, num_times = 1): 
        """Click the clicker some number of times.""" 
        self.count += num_times 
 
    def read(self): 
        return self.count 
 
    def reset(self): 
        self.count = 0

Writing tests like the ones below helps us be confident that our code works the way it was designed to, and that it keeps doing so whenever we make changes to it.

In [83]:
clicker = CountingClicker(0)
assert clicker.read() == 0, "clicker should start with count 0"
clicker.click()
clicker.click()
assert clicker.read()==2,  "after two clicks, clicker should have count 2"
clicker.reset()
assert clicker.read() ==0 , "after reset, clicker should be back to 0"

In [1]:
def generate_range(n):
    i=0
    while i<n:
        yield i  # each yield produces one value of the generator
        i+=1

for i in generate_range(10):
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3
i: 4
i: 5
i: 6
i: 7
i: 8
i: 9
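
A generator produces its values lazily, one at a time, and can only be consumed once. A small sketch of that behaviour, repeating the `generate_range` definition so the block stands alone (`gen` and `remaining` are illustrative names):

```python
def generate_range(n):
    i = 0
    while i < n:
        yield i
        i += 1

gen = generate_range(3)
print(next(gen))       # 0
print(next(gen))       # 1
remaining = list(gen)  # consumes whatever is left
print(remaining)       # [2]
print(list(gen))       # [] because the generator is now exhausted
```

To iterate again, call `generate_range` again to get a fresh generator.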
