In [2]:
import numpy as np
import pandas as pd


# What are vectorized operations
Vectorized operations refer to performing operations on entire arrays or vectors of data at once, rather than using loops to operate on individual elements one at a time. This concept is especially powerful in languages and libraries like Python with NumPy, MATLAB, R, and others that support array programming.

In [5]:
a = np.array([1, 2, 3, 4])
a * 4

array([ 4,  8, 12, 16])
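For comparison, here is a minimal sketch of the same multiplication written both ways. The names `values`, `looped`, and `vectorized` are illustrative and not part of the original notebook.

```python
import numpy as np

values = [1, 2, 3, 4]

# Loop-based version: one Python-level multiplication per element.
looped = [v * 4 for v in values]

# Vectorized version: a single expression applied to the whole array.
vectorized = np.array(values) * 4

print(looped)               # [4, 8, 12, 16]
print(vectorized.tolist())  # [4, 8, 12, 16]
```

Both produce the same numbers; the difference is that the vectorized form hands the loop to NumPy instead of running it in the interpreter.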

**Vanilla Python** (i.e., Python without libraries like NumPy or pandas) doesn't support **true vectorized operations**, which leads to several problems:

---

### ❌ Problems with Vectorized Operations in Vanilla Python

#### 1. **No Native Support for Arrays**

* Python lists are **not designed** for numerical computation.
* Operations like `a + b` on lists **concatenate** instead of adding element-wise.

```python
a = [1, 2, 3]
b = [4, 5, 6]
print(a + b)  # Output: [1, 2, 3, 4, 5, 6] → Not element-wise addition!
```

#### 2. **Slow Performance**

* Python loops are **interpreted** and **not optimized** for heavy computation.
* Element-wise operations must be done with `for` loops or list comprehensions, which are **significantly slower** than vectorized NumPy code.

#### 3. **No Broadcasting**

* Python lists don't support NumPy-style broadcasting: adding a scalar to a list of numbers raises a `TypeError` instead of adding it to every element.

```python
a = [1, 2, 3]
print(a + 5)  # TypeError
```

#### 4. **Manual Loops Increase Complexity**

* You have to explicitly write loops or use list comprehensions for operations that could be one-liners in NumPy.

```python
# Element-wise multiplication in vanilla Python
a = [1, 2, 3]
b = [4, 5, 6]
c = [x * y for x, y in zip(a, b)]  # More verbose
```

---

### ✅ Solution: Use Libraries Like NumPy

NumPy arrays allow you to write **clean, fast, and memory-efficient** code that mimics mathematical notation.
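As a rough illustration, the three list examples above could be rewritten with NumPy arrays as follows. The variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

elementwise_sum = a + b      # element-wise addition, not concatenation
broadcast_sum = a + 5        # the scalar 5 is broadcast across every element
elementwise_product = a * b  # no explicit loop or zip needed

print(elementwise_sum.tolist())      # [5, 7, 9]
print(broadcast_sum.tolist())        # [6, 7, 8]
print(elementwise_product.tolist())  # [4, 10, 18]
```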



In [10]:
# This does not work in vanilla Python: None has no .startswith method
s = ['cat','mat',None,'rat']

[i.startswith('c') for i in s]

AttributeError: 'NoneType' object has no attribute 'startswith'

In [12]:
# How pandas solves this issue?

s = pd.Series(['cat','mat',None,'rat'])
# string accessor
s.str.startswith('c')

# the missing value is passed through instead of raising an error

0     True
1    False
2     None
3    False
dtype: object
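The result above has `object` dtype because of the missing entry, so it cannot be used directly as a boolean filter. One possible way around this, sketched here on the same toy Series, is the `na` parameter of `str.startswith`; the name `mask` is illustrative.

```python
import pandas as pd

s = pd.Series(['cat', 'mat', None, 'rat'])

# na=False fills the missing entry with False, giving a pure boolean mask.
mask = s.str.startswith('c', na=False)

print(mask.tolist())     # [True, False, False, False]
print(s[mask].tolist())  # ['cat']
```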

In [14]:
df=pd.read_csv('titanic.csv')
df

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C


In [18]:
#lower
df['Name'].str.lower()

0                                braund, mr. owen harris
1      cumings, mrs. john bradley (florence briggs th...
2                                 heikkinen, miss. laina
3           futrelle, mrs. jacques heath (lily may peel)
4                               allen, mr. william henry
                             ...                        
886                                montvila, rev. juozas
887                         graham, miss. margaret edith
888             johnston, miss. catherine helen "carrie"
889                                behr, mr. karl howell
890                                  dooley, mr. patrick
Name: Name, Length: 891, dtype: object

In [20]:
#upper
df['Name'].str.upper()

0                                BRAUND, MR. OWEN HARRIS
1      CUMINGS, MRS. JOHN BRADLEY (FLORENCE BRIGGS TH...
2                                 HEIKKINEN, MISS. LAINA
3           FUTRELLE, MRS. JACQUES HEATH (LILY MAY PEEL)
4                               ALLEN, MR. WILLIAM HENRY
                             ...                        
886                                MONTVILA, REV. JUOZAS
887                         GRAHAM, MISS. MARGARET EDITH
888             JOHNSTON, MISS. CATHERINE HELEN "CARRIE"
889                                BEHR, MR. KARL HOWELL
890                                  DOOLEY, MR. PATRICK
Name: Name, Length: 891, dtype: object

In [26]:
#capitalize
df['Name'].str.capitalize()

0                                Braund, mr. owen harris
1      Cumings, mrs. john bradley (florence briggs th...
2                                 Heikkinen, miss. laina
3           Futrelle, mrs. jacques heath (lily may peel)
4                               Allen, mr. william henry
                             ...                        
886                                Montvila, rev. juozas
887                         Graham, miss. margaret edith
888             Johnston, miss. catherine helen "carrie"
889                                Behr, mr. karl howell
890                                  Dooley, mr. patrick
Name: Name, Length: 891, dtype: object

In [28]:
#title
df['Name'].str.title()

0                                Braund, Mr. Owen Harris
1      Cumings, Mrs. John Bradley (Florence Briggs Th...
2                                 Heikkinen, Miss. Laina
3           Futrelle, Mrs. Jacques Heath (Lily May Peel)
4                               Allen, Mr. William Henry
                             ...                        
886                                Montvila, Rev. Juozas
887                         Graham, Miss. Margaret Edith
888             Johnston, Miss. Catherine Helen "Carrie"
889                                Behr, Mr. Karl Howell
890                                  Dooley, Mr. Patrick
Name: Name, Length: 891, dtype: object
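The difference between the four case methods is easier to see on a single mixed-case string. This is a small sketch using a made-up value rather than the Titanic data.

```python
import pandas as pd

names = pd.Series(['braund, MR. owen harris'])

lowered = names.str.lower()[0]
uppered = names.str.upper()[0]
capitalized = names.str.capitalize()[0]  # first character upper, rest lower
titled = names.str.title()[0]            # first letter of every word upper

print(lowered)      # braund, mr. owen harris
print(uppered)      # BRAUND, MR. OWEN HARRIS
print(capitalized)  # Braund, mr. owen harris
print(titled)       # Braund, Mr. Owen Harris
```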

In [38]:
# len: select the row(s) whose name has the maximum length
df[df['Name'].str.len()==df['Name'].str.len().max()]['Name']

307    Penasco y Castellana, Mrs. Victor de Satode (M...
Name: Name, dtype: object
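An alternative sketch, assuming only a single longest name is needed, uses `idxmax` on the lengths. The three-row `sample` frame is a stand-in for the real data. Unlike the boolean filter above, `idxmax` returns only the first match when several names tie for the maximum.

```python
import pandas as pd

sample = pd.DataFrame({'Name': [
    'Behr, Mr. Karl Howell',
    'Heikkinen, Miss. Laina',
    'Allen, Mr. William Henry',
]})

lengths = sample['Name'].str.len()  # character count per row
longest_label = lengths.idxmax()    # index label of the first maximum
longest_name = sample.loc[longest_label, 'Name']

print(longest_name)  # Allen, Mr. William Henry
```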

In [42]:
# strip removes leading and trailing whitespace
df['Name'].str.strip()

0                                Braund, Mr. Owen Harris
1      Cumings, Mrs. John Bradley (Florence Briggs Th...
2                                 Heikkinen, Miss. Laina
3           Futrelle, Mrs. Jacques Heath (Lily May Peel)
4                               Allen, Mr. William Henry
                             ...                        
886                                Montvila, Rev. Juozas
887                         Graham, Miss. Margaret Edith
888             Johnston, Miss. Catherine Helen "Carrie"
889                                Behr, Mr. Karl Howell
890                                  Dooley, Mr. Patrick
Name: Name, Length: 891, dtype: object
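The Titanic names show no visible change after `strip`, so here is a small sketch on deliberately padded, made-up strings, together with the one-sided variants `lstrip` and `rstrip`.

```python
import pandas as pd

padded = pd.Series(['  Braund ', 'Allen  ', ' Behr'])

both_sides = padded.str.strip()   # trims both ends
left_only = padded.str.lstrip()   # trims leading whitespace only
right_only = padded.str.rstrip()  # trims trailing whitespace only

print(both_sides.tolist())  # ['Braund', 'Allen', 'Behr']
print(left_only.tolist())   # ['Braund ', 'Allen  ', 'Behr']
print(right_only.tolist())  # ['  Braund', 'Allen', ' Behr']
```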

In [48]:
df['Surname']=df['Name'].str.split(',').str.get(0)
df['Surname']

0         Braund
1        Cumings
2      Heikkinen
3       Futrelle
4          Allen
         ...    
886     Montvila
887       Graham
888     Johnston
889         Behr
890       Dooley
Name: Surname, Length: 891, dtype: object

In [52]:

df['title']=df['Name'].str.split(',').str.get(1).str.strip().str.split(' ').str.get(0)
df['FirstName']=df['Name'].str.split(',').str.get(1).str.strip().str.split(' ').str.get(1)

df[['title','FirstName','Surname']]

Unnamed: 0,title,FirstName,Surname
0,Mr.,Owen,Braund
1,Mrs.,John,Cumings
2,Miss.,Laina,Heikkinen
3,Mrs.,Jacques,Futrelle
4,Mr.,William,Allen
...,...,...,...
886,Rev.,Juozas,Montvila
887,Miss.,Margaret,Graham
888,Miss.,Catherine,Johnston
889,Mr.,Karl,Behr


In [54]:
# n limits the number of splits performed; n=1 splits only at the first space
df['title']=df['Name'].str.split(',').str.get(1).str.strip().str.split(' ',n=1).str.get(0)
df['FirstName']=df['Name'].str.split(',').str.get(1).str.strip().str.split(' ',n=1).str.get(1)

df[['title','FirstName','Surname']]

Unnamed: 0,title,FirstName,Surname
0,Mr.,Owen Harris,Braund
1,Mrs.,John Bradley (Florence Briggs Thayer),Cumings
2,Miss.,Laina,Heikkinen
3,Mrs.,Jacques Heath (Lily May Peel),Futrelle
4,Mr.,William Henry,Allen
...,...,...,...
886,Rev.,Juozas,Montvila
887,Miss.,Margaret Edith,Graham
888,Miss.,"Catherine Helen ""Carrie""",Johnston
889,Mr.,Karl Howell,Behr


In [58]:
# expand=True returns the split pieces as separate columns instead of a list
df[['title','FirstName']]=df['Name'].str.split(',').str.get(1).str.strip().str.split(' ',n=1,expand=True)
df

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,Surname,title,FirstName
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S,Braund,Mr.,Owen Harris
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C,Cumings,Mrs.,John Bradley (Florence Briggs Thayer)
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S,Heikkinen,Miss.,Laina
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S,Futrelle,Mrs.,Jacques Heath (Lily May Peel)
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S,Allen,Mr.,William Henry
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S,Montvila,Rev.,Juozas
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S,Graham,Miss.,Margaret Edith
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S,Johnston,Miss.,"Catherine Helen ""Carrie"""
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C,Behr,Mr.,Karl Howell
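The whole name-parsing sequence can be condensed into two splits. The following is a minimal sketch on a two-row stand-in frame called `sample`, and it assumes every name follows the `Surname, Title Given names` layout.

```python
import pandas as pd

sample = pd.DataFrame({'Name': [
    'Braund, Mr. Owen Harris',
    'Heikkinen, Miss. Laina',
]})

# First split: surname vs. the rest, at the first comma only.
parts = sample['Name'].str.split(',', n=1, expand=True)
sample['Surname'] = parts[0]

# Second split: title vs. given names, at the first space only.
sample[['title', 'FirstName']] = (
    parts[1].str.strip().str.split(' ', n=1, expand=True)
)

print(sample['Surname'].tolist())    # ['Braund', 'Heikkinen']
print(sample['title'].tolist())      # ['Mr.', 'Miss.']
print(sample['FirstName'].tolist())  # ['Owen Harris', 'Laina']
```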


In [60]:
df['title'].value_counts()

Mr.          517
Miss.        182
Mrs.         125
Master.       40
Dr.            7
Rev.           6
Mlle.          2
Major.         2
Col.           2
the            1
Capt.          1
Ms.            1
Sir.           1
Lady.          1
Mme.           1
Don.           1
Jonkheer.      1
Name: title, dtype: int64
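The stray `the` in the counts suggests that at least one title spans more than one word, which splitting on the first space cannot capture. One possible alternative, shown here as a sketch rather than the notebook's method, is `str.extract` with a regular expression that captures everything between the comma and the next period. The third name below is an assumed example of such an entry.

```python
import pandas as pd

names = pd.Series([
    'Braund, Mr. Owen Harris',
    'Heikkinen, Miss. Laina',
    # Assumed example of a multi-word title.
    'Rothes, the Countess. of (Lucy Noel Martha Dyer-Edwards)',
])

# Capture the text after ", " up to (not including) the first period.
titles = names.str.extract(r',\s*([^.]+)\.', expand=False)

print(titles.tolist())  # ['Mr', 'Miss', 'the Countess']
```

Note that this variant drops the trailing period from each title, unlike the split-based columns above.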

In [62]:
# regex=False treats '.' as a literal period rather than a regex wildcard
df['title']=df['title'].str.replace('Ms.','Miss.',regex=False)
df['title']=df['title'].str.replace('Mlle.','Miss.',regex=False)



In [64]:
df['title'].value_counts()

Mr.          517
Miss.        185
Mrs.         125
Master.       40
Dr.            7
Rev.           6
Major.         2
Col.           2
Don.           1
Mme.           1
Lady.          1
Sir.           1
Capt.          1
the            1
Jonkheer.      1
Name: title, dtype: int64
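Whether the pattern passed to `str.replace` is treated as a regular expression matters here, because `.` matches any character in a regex. This sketch uses made-up values to show the difference; `Msx` is a hypothetical string chosen only to trigger the wildcard.

```python
import pandas as pd

titles = pd.Series(['Ms.', 'Msx', 'Mrs.'])

# As a regex, 'Ms.' means "Ms followed by any character", so 'Msx' matches too.
as_regex = titles.str.replace('Ms.', 'Miss.', regex=True)

# As a literal string, only the exact text 'Ms.' is replaced.
as_literal = titles.str.replace('Ms.', 'Miss.', regex=False)

print(as_regex.tolist())    # ['Miss.', 'Miss.', 'Mrs.']
print(as_literal.tolist())  # ['Miss.', 'Msx', 'Mrs.']
```

Passing `regex` explicitly avoids depending on the default, which has differed between pandas versions.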

In [72]:
#filtering
df['FirstName'].str.startswith('A')

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: FirstName, Length: 891, dtype: bool

In [76]:
df['FirstName'].str.endswith('A')

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: FirstName, Length: 891, dtype: bool

In [78]:
df['FirstName'].str.isdigit()

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
887    False
888    False
889    False
890    False
Name: FirstName, Length: 891, dtype: bool

In [82]:
# case=False makes the match case-insensitive
df[df['FirstName'].str.contains('john',case=False)]['FirstName']

1            John Bradley (Florence Briggs Thayer)
41     William John Robert (Dorothy Ann Wonnacott)
45                                    William John
98                         John T (Ada Julia Bone)
112                                     David John
117                            William John Robert
160                                  John Hatfield
162                                    John Viktor
165                   Frank John William "Frankie"
168                                         John D
188                                           John
212                                     John Henry
226                                   William John
227                            John Hall ("Henry")
324                                 George John Jr
328                 Frank John (Emily Alice Brown)
401                                           John
418                                   William John
467                                John Montgomery
527                            

In [92]:
df[df['FirstName'].str.contains('^[aeiouAEIOU].+[aeiouAEIOU]$')]['FirstName']

16                     Eugene
38              Augusta Maria
61                     Amelie
64                   Albert A
68             Erna Alexandra
80                    Achille
106             Anna Kristine
119          Ellis Anna Maria
128                      Anna
135                     Emile
141                Anna Sofia
152                   Alfonzo
164              Eino Viljami
195                     Elise
216                    Eliina
218                    Albina
235              Alice Phoebe
246    Agda Thorilda Viktoria
258                      Anna
269                    Amelia
276         Augusta Charlotta
293                   Aloisia
298                   Adolphe
311               Emily Borie
363                     Adola
368                     Annie
376             Aurora Adelia
396                     Elina
409                       Ida
474                 Ida Sofia
520                      Anne
541       Ingeborg Constanzia
566                      Ilia
615       
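To see what the vowel pattern accepts and rejects, here is a small sketch on made-up values. It uses `case=False` in place of listing both upper- and lower-case vowels, which is an alternative spelling of the same idea rather than the notebook's exact call.

```python
import pandas as pd

first_names = pd.Series(['Anna', 'Owen', 'Ae', 'Albert A'])

# ^[aeiou]  : first character is a vowel
# .+        : at least one character in between
# [aeiou]$  : last character is a vowel
pattern = r'^[aeiou].+[aeiou]$'
starts_and_ends_with_vowel = first_names.str.contains(pattern, case=False)

print(starts_and_ends_with_vowel.tolist())  # [True, False, False, True]
```

`Ae` is rejected because `.+` requires at least one character between the two vowels, so the shortest possible match has three characters.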

# Slicing

In [100]:
df['Name'].str[:4]

0      Brau
1      Cumi
2      Heik
3      Futr
4      Alle
       ... 
886    Mont
887    Grah
888    John
889    Behr
890    Dool
Name: Name, Length: 891, dtype: object

In [102]:
df['Name'].str[:4:2]

0      Ba
1      Cm
2      Hi
3      Ft
4      Al
       ..
886    Mn
887    Ga
888    Jh
889    Bh
890    Do
Name: Name, Length: 891, dtype: object

In [104]:
df['Name'].str[::-1]

0                                sirraH newO .rM ,dnuarB
1      )reyahT sggirB ecnerolF( yeldarB nhoJ .srM ,sg...
2                                 aniaL .ssiM ,nenikkieH
3           )leeP yaM yliL( htaeH seuqcaJ .srM ,ellertuF
4                               yrneH mailliW .rM ,nellA
                             ...                        
886                                sazouJ .veR ,alivtnoM
887                         htidE teragraM .ssiM ,maharG
888             "eirraC" neleH enirehtaC .ssiM ,notsnhoJ
889                                llewoH lraK .rM ,rheB
890                                  kcirtaP .rM ,yelooD
Name: Name, Length: 891, dtype: object
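The bracket syntax on `.str` follows ordinary Python slice rules (`start:stop:step`). As a sketch on two made-up values, the same results can also be obtained with the explicit `str.slice` method.

```python
import pandas as pd

names = pd.Series(['Braund', 'Allen'])

first_four = names.str.slice(0, 4)      # same as names.str[:4]
every_other = names.str.slice(0, 4, 2)  # same as names.str[:4:2]
reversed_names = names.str[::-1]        # negative step reverses each string

print(first_four.tolist())      # ['Brau', 'Alle']
print(every_other.tolist())     # ['Ba', 'Al']
print(reversed_names.tolist())  # ['dnuarB', 'nellA']
```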