# Python Random Tips

### Author: Sreekiran A R
*MSc Artificial Intelligence & Machine Learning, University of Birmingham* -
[LinkedIn](https://www.linkedin.com/in/sreekiran-a-r-3b05a1116/)

## Tips from solving HackerRank


### 1. Remove duplicates from a string, preserving the first occurrence

In [3]:
foo = "mppmt"
result = "".join(dict.fromkeys(foo))
### dicts preserve insertion order (guaranteed since Python 3.7)
print(result)

mpt


### 2. Find the elements of a list that occur only once using set operations

In [7]:
rooms = [1,2,2,3,3,4,4,5]
### create two sets: seta collects each element the first time it is seen, setb collects elements that are seen again
seta = set()
setb = set()
for i in rooms:
    if i in seta:
        setb.add(i)
    else:
        seta.add(i)
### finally, take the difference to keep only the elements occurring once
print(*seta-setb)

1 5
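
The same result can also be reached by counting occurrences. The following is a minimal sketch using `collections.Counter` from the standard library, which is a different technique from the two-set approach above; the name `once` is only illustrative.

```python
from collections import Counter

rooms = [1, 2, 2, 3, 3, 4, 4, 5]

# Count how many times each element appears
counts = Counter(rooms)

# Keep only the elements seen exactly once, in order of first appearance
once = [room for room, n in counts.items() if n == 1]
print(*once)
```

Unlike the set difference, this version keeps the elements in the order they first appear in the list.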


### 3. Math tip to get the pattern 1, 11, 111, 1111, ...

In [9]:
def pattern(n):
    for i in range(1,n+1):
        print(10**(i)//9)
pattern(5)

1
11
111
1111
11111
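
Why this works: `10**i - 1` is a run of `i` nines, and dividing it by 9 turns every nine into a one. Since `10**i` leaves a remainder of 1 when divided by 9, the integer division `10**i // 9` gives the same value. A small check of that reasoning, where the helper name `repunit` is only illustrative:

```python
def repunit(i):
    # 10**i // 9 equals (10**i - 1) // 9 because 10**i has remainder 1 modulo 9
    return 10**i // 9

for i in range(1, 6):
    # Compare against the exact division and against a string of i ones
    assert repunit(i) == (10**i - 1) // 9 == int("1" * i)

print([repunit(i) for i in range(1, 6)])
```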


### 4. itertools groupby

In [10]:
from itertools import groupby
print([k for k, g in groupby('AAAABBBCCDAABBB')])
print([list(g) for k, g in groupby('AAAABBBCCD')])

['A', 'B', 'C', 'D', 'A', 'B']
[['A', 'A', 'A', 'A'], ['B', 'B', 'B'], ['C', 'C'], ['D']]
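
Combining the key with the size of its group gives a simple run-length encoding. This is a sketch under the assumption that only the length of each run is needed; the function name `run_lengths` is hypothetical.

```python
from itertools import groupby

def run_lengths(text):
    # groupby only groups adjacent equal items, so each run is reported separately
    return [(key, len(list(group))) for key, group in groupby(text)]

print(run_lengths('AAAABBBCCDAABBB'))
```

Note that `A` and `B` each appear twice in the result, because `groupby` starts a new group whenever the key changes.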


### 5. namedtuple (tuples with named fields)

In [3]:
from collections import namedtuple

car = namedtuple('CAR','PRICE MILEAGE SPEED')
car_ob = car(1000000,20,100)
print(car_ob.PRICE, car_ob.MILEAGE, car_ob.SPEED)

1000000 20 100
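
A namedtuple is still a tuple, so index access keeps working, and it comes with a few helper methods. The sketch below uses an illustrative `Car` type with lower-case field names, which are assumptions made for this example.

```python
from collections import namedtuple

# Illustrative type; the field names are assumptions for this sketch
Car = namedtuple('Car', 'price mileage speed')
car = Car(1000000, 20, 100)

# Index access and attribute access return the same value
print(car[0], car.price)

# _asdict() converts the tuple into a mapping of field name to value
print(car._asdict())

# namedtuples are immutable; _replace() returns a new tuple with the change
cheaper = car._replace(price=900000)
print(cheaper)
```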


### 6. Regex pattern to get alternating digit matches

In [13]:
import re
pattern = r"(\d)(?=\d\1)"

In [15]:
re.findall(pattern,'121212') ### note: findall returns only the captured first digit of each alternating triple (e.g. the leading 1 of 121), not the whole triple

['1', '2', '1', '2']

### 7. Naming groups in regex match
`(?P<name>...)` can be used to name the subgroups when doing a regex match.

In [20]:
import re
m = re.match(r'(?P<user>\w+)@(?P<website>\w+)\.(?P<extension>\w+)','sreekiran@website.com')
m.groupdict()

{'user': 'sreekiran', 'website': 'website', 'extension': 'com'}
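
Individual named groups can also be read one at a time. A short sketch reusing the same pattern:

```python
import re

m = re.match(r'(?P<user>\w+)@(?P<website>\w+)\.(?P<extension>\w+)', 'sreekiran@website.com')

# A named group can be read by name or by its position
print(m.group('user'))
print(m.group(1))

# Subscript access on a match object is available from Python 3.6
print(m['website'])
```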

### 8. Find consecutive character matches in regex

In [29]:
import re
s = 'ppppythonisffffun'
### match runs of 3 or more of the same character
### findall returns (whole run, single character) tuples, so keep only match[0]
print([match[0] for match in re.findall(r'((\w)\2{2,})', s)])

['pppp', 'ffff']


### 9. Find matches with overlap

- With a normal regex search, the characters consumed by one match are not considered again when looking for the next matches.

- For example, in the string AABBAAA, I need to find every position where A occurs twice consecutively.

In [50]:
import re
re.findall(r'AA','AABBAAA')


['AA', 'AA']

- Here we got only two matches, but the trailing AAA actually contains two overlapping AA pairs. The first pair consumed two of the three A's, so only one A was left and the second pair was missed.

- To solve this, we use the lookahead assertion `(?=...)`.
- The capturing group inside the lookahead records the text you're interested in, but the actual match is technically the zero-width substring before the lookahead, so the matches themselves are non-overlapping.

In [51]:
import re
re.findall(r'(?=(AA))','AABBAAA')

['AA', 'AA', 'AA']
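
To see where each overlapping pair starts, `re.finditer` can be used with the same pattern. A minimal sketch; the variable names are illustrative.

```python
import re

text = 'AABBAAA'

# Each match is zero-width, so m.start() is the index where the captured pair begins
positions = [m.start() for m in re.finditer(r'(?=(AA))', text)]

# Slice the original string at those positions to recover the pairs
pairs = [text[i:i + 2] for i in positions]

print(positions)
print(pairs)
```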

### 10. List in which we can add or remove elements from both ends: Deque


**collections.deque()**:

- A deque is a double-ended queue. It can be used to add or remove elements from both ends.

- Deques support thread-safe, memory-efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.

In [4]:
from collections import deque

l1 = deque()
l1.append(1)
l1

deque([1])

In [55]:
l1.extend([1,2,3])
l1

deque([1, 1, 2, 3])

In [56]:
l1.appendleft(2)
l1

deque([2, 1, 1, 2, 3])

In [57]:
l1.extendleft([7,8,9])
l1

deque([9, 8, 7, 2, 1, 1, 2, 3])

In [58]:
l1.pop()
l1

deque([9, 8, 7, 2, 1, 1, 2])

In [59]:
l1.popleft()
l1

deque([8, 7, 2, 1, 1, 2])

In [62]:
l1.reverse()
l1

deque([2, 1, 1, 2, 7, 8])

In [63]:
l1.rotate(3)

In [64]:
l1

deque([2, 7, 8, 2, 1, 1])
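
A deque can also be bounded with the `maxlen` argument, which is handy for keeping a sliding window of recent items. A small sketch, assuming a window of the last three values is wanted:

```python
from collections import deque

# A deque with maxlen=3 keeps only the 3 most recent items
window = deque(maxlen=3)

for value in [1, 2, 3, 4, 5]:
    # Once the deque is full, appending on the right drops the oldest item on the left
    window.append(value)

print(window)
```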

---

## General Python tips

### 1. Adding an element to a sorted list, maintaining the order.

Sorting the whole list again after every insertion is computationally expensive.
The Python standard library has a module called `bisect` that helps here: it finds the insertion point with a binary search, so the list never needs a full re-sort.

In [5]:
import bisect
sorted_list = [1,2,3,5]
new_element = 4
bisect.insort_left(sorted_list,new_element)

In [6]:
sorted_list

[1, 2, 3, 4, 5]

- [Bisect Python documentation](https://docs.python.org/3/library/bisect.html)     
- [LinkedIn post](https://www.linkedin.com/posts/sreekiranar_bisect-array-bisection-algorithm-activity-6715680837087711232-F6Sk)
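
The module can also report the insertion index without modifying the list. The sketch below shows `bisect_left` and, as a contrast to `insort_left`, `insort_right`, which inserts after any existing equal elements.

```python
import bisect

sorted_list = [1, 2, 3, 5]

# bisect_left returns the index at which the element would be inserted
index = bisect.bisect_left(sorted_list, 4)
print(index)

# insort_right places the new element after existing equal elements;
# for plain numbers the resulting list looks the same either way
bisect.insort_right(sorted_list, 3)
print(sorted_list)
```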

### 2. Interesting usage of the *sum* function

- Did you know the sum function in Python can be used to flatten a list of lists?

In [4]:
sum([[1, 2, 3],[3, 4, 5],['a', 'b', 'c']], [])

[1, 2, 3, 3, 4, 5, 'a', 'b', 'c']

As they say in the Zen of Python, Flat is better than nested :)
- [Zen of Python](https://www.python.org/dev/peps/pep-0020/)
- [LinkedIn post](https://www.linkedin.com/posts/sreekiranar_python-programming-lessons-activity-6728708314374504448-hmO7)
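
One caveat: `sum` builds a new list for every addition, so it slows down as the number of sublists grows. For larger inputs, `itertools.chain.from_iterable` is a commonly used alternative; the sketch below just confirms that both give the same result.

```python
from itertools import chain

nested = [[1, 2, 3], [3, 4, 5], ['a', 'b', 'c']]

# The sum trick from above
flat_sum = sum(nested, [])

# chain.from_iterable walks the sublists lazily, one element at a time
flat_chain = list(chain.from_iterable(nested))

print(flat_chain)
```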

### 3. Splitting a NumPy array into N equal chunks

In [8]:
import numpy as np


In [9]:
a = np.arange(9)
np.split(a,3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

- **np.split** raises a `ValueError` if the array cannot be split into equal sub-arrays
- Use **np.array_split** in those scenarios

In [10]:
a = np.arange(0,8)
np.array_split(a,3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7])]

When dealing with raw data, especially when doing mathematical computations for machine learning and neural computation, we might need to flatten and reshape arrays. Here are two useful functions that will help you achieve this.

1. **np.squeeze()**: Remove the axes of length one from an array

In [11]:
a = np.arange(10).reshape(-1,1,1)
a

array([[[0]],

       [[1]],

       [[2]],

       [[3]],

       [[4]],

       [[5]],

       [[6]],

       [[7]],

       [[8]],

       [[9]]])

In [10]:
np.squeeze(a)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

2. **ndarray.flatten()**: Return a copy of the array collapsed into one dimension

In [14]:
a.flatten()

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
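
The two calls give the same result above only because every axis except the first has length one. A small sketch with an illustrative array `b` of shape (2, 1, 3) shows the difference:

```python
import numpy as np

b = np.arange(6).reshape(2, 1, 3)

# squeeze removes only the axes of length one, so the 2 x 3 structure survives
squeezed = np.squeeze(b)
print(squeezed.shape)

# flatten always collapses the array into a single dimension
flat = b.flatten()
print(flat.shape)
```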

### 4. Printing a list in reverse order, excluding the last element

In [12]:
x = [1,2,3,4,5]
### desired output: [4,3,2,1]
x[-2::-1] ### start at index -2 (the second-to-last element) and step backwards

[4, 3, 2, 1]
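
The single slice is equivalent to dropping the last element first and then reversing, as this quick check illustrates:

```python
x = [1, 2, 3, 4, 5]

# Two-step version: drop the last element, then reverse
two_step = x[:-1][::-1]

# One-step version: start at the second-to-last element and walk backwards
one_step = x[-2::-1]

assert one_step == two_step == list(reversed(x[:-1]))
print(one_step)
```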

### 5. The walrus operator
- Python 3.8 introduced the walrus operator `:=`, which assigns a value to a variable as part of a larger expression. Interestingly, the operator was affectionately named for its resemblance to the eyes and tusks of a walrus! :D

- This can be used to make your code more compact
- For example:

In [2]:
# without walrus
count = 10
if count % 5 == 0:
    print(f"{count} is divisible by 5")


10 is divisible by 5


In [6]:
# with walrus
if (count := 10) % 5 == 0:
    print(f"{count} is divisible by 5")

10 is divisible by 5


For more info: [What’s New In Python 3.8](https://docs.python.org/3/whatsnew/3.8.html#assignment-expressions)
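
The operator is most useful when a value is both tested and used afterwards, for instance with regex matches. A minimal sketch, assuming Python 3.8 or later; the sample `lines` data is made up for illustration.

```python
import re

# Made-up sample data for this sketch
lines = ["user=alice", "no match here", "user=bob"]

users = []
for line in lines:
    # Assign the match object and test it in the same expression
    if (m := re.search(r'user=(\w+)', line)) is not None:
        users.append(m.group(1))

print(users)
```

Without the walrus operator, the `re.search` call would need its own line before the `if` statement.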

### 6. Progress Tracking made simpler

- Introducing `tqdm`, a Python wrapper that adds a progress bar to loops
- tqdm derives from the Arabic word taqaddum (تقدّم), which can mean "progress", and is also an abbreviation for "I love you so much" in Spanish (te quiero demasiado)


#### Basic usage - Wrap `tqdm` around any iterable

In [7]:
from tqdm import tqdm
from time import sleep

text = ""
for char in tqdm(["how", "long", "will", "it", "take"]):
    ### this can be replaced with any operation required
    sleep(1)
    text = text + char

100%|██████████| 5/5 [00:05<00:00,  1.01s/it]


#### `trange(n)` - shorthand for `tqdm(range(n))`

In [8]:
from tqdm import trange

for i in trange(100):
    sleep(0.01)

100%|██████████| 100/100 [00:01<00:00, 85.20it/s]


- For information on customising the progress bar further, visit the official GitHub [page](https://github.com/tqdm/tqdm).