30. Filter Words with Vowels

Write a Pandas program to filter words from a given series that contain at least two vowels.



In [75]:
colors = pd.Series(['red', 'green', 'blue', 'yellow', 'orange', 'purple', 'pink', 'brown', 'black', 'white'])
print("Color Series:")
print(colors)

# Filter words with at least two vowels using a lambda function
filtered = colors[colors.apply(lambda x: sum(1 for c in x.lower() if c in 'aeiou') >= 2)]
print("\nWords with at least two vowels:")
print(filtered)

Color Series:
0       red
1     green
2      blue
3    yellow
4    orange
5    purple
6      pink
7     brown
8     black
9     white
dtype: object

Words with at least two vowels:
1     green
2      blue
3    yellow
4    orange
5    purple
9     white
dtype: object
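
The same filter can also be written without a Python-level lambda. The sketch below is one possible alternative, assuming the same color data as above; it counts vowels with `Series.str.count` and passes `re.IGNORECASE` so uppercase vowels would be counted as well. The names `vowel_counts` and `filtered_vec` are illustrative and not part of the original solution.

```python
import re
import pandas as pd

colors = pd.Series(['red', 'green', 'blue', 'yellow', 'orange',
                    'purple', 'pink', 'brown', 'black', 'white'])

# Count vowels per word with a regex; IGNORECASE also matches 'A', 'E', ...
vowel_counts = colors.str.count(r'[aeiou]', flags=re.IGNORECASE)

# Keep only the words with at least two vowels
filtered_vec = colors[vowel_counts >= 2]
print(filtered_vec.tolist())
```

Both versions should select the same rows; the string-method form mainly reads more compactly on larger series.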


31. Euclidean Distance

Write a Pandas program to compute the Euclidean distance between two given series.

Euclidean distance
From Wikipedia: In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm.



In [80]:
import pandas as pd
import numpy as np

# Create two example Series
s1 = pd.Series(np.random.randint(1, 10, size=5))
s2 = pd.Series(np.random.randint(10, 20, size=5))

print("Series 1:")
print(s1)
print("\nSeries 2:")
print(s2)

# Compute Euclidean distance
distance = np.linalg.norm(s1 - s2)
print("\nEuclidean distance between s1 and s2:", distance)

Series 1:
0    6
1    5
2    2
3    8
4    5
dtype: int32

Series 2:
0    14
1    14
2    18
3    13
4    14
dtype: int32

Euclidean distance between s1 and s2: 22.516660498395403
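
As a cross-check, the distance can be computed directly from the definition: the square root of the sum of squared element-wise differences. The sketch below hard-codes the values from the printed output above, since the original cell draws them at random; `manual` and `via_norm` are illustrative names.

```python
import numpy as np
import pandas as pd

# Values copied from the printed output above (the original cell is random)
s1 = pd.Series([6, 5, 2, 8, 5])
s2 = pd.Series([14, 14, 18, 13, 14])

# Definition: sqrt of the sum of squared element-wise differences
manual = np.sqrt(((s1 - s2) ** 2).sum())

# Same quantity through NumPy's vector norm, as used in the solution
via_norm = np.linalg.norm(s1 - s2)

print(manual)
print(np.isclose(manual, via_norm))
```

For these values the squared differences are 64, 81, 256, 25 and 81, which sum to 507, so both expressions evaluate to the square root of 507.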


32. Find Surrounded Peaks

Write a Pandas program to find the positions of the values neighboured by smaller values on both sides in a given series.



In [99]:
s = pd.Series(np.random.randint(1, 100, size=20))
print("\nRandom Series:")
print(s)

# Find positions where the value is greater than its neighbors (local peaks)
peaks = s[(s.shift(1) < s) & (s.shift(-1) < s)]
print("\nPositions of values neighboured by smaller values on both sides (peaks):")
print(peaks)


Random Series:
0     20
1     82
2     54
3     25
4     39
5     33
6     27
7     17
8     38
9     70
10     9
11    68
12    33
13    20
14    78
15    95
16    72
17    55
18    71
19    28
dtype: int32

Positions of values neighboured by smaller values on both sides (peaks):
1     82
4     39
9     70
11    68
15    95
18    71
dtype: int32
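
The cell above prints the peak values together with their index labels. If only the integer positions are wanted, one option is to turn the boolean mask into positions with `np.flatnonzero`. This is a minimal sketch that reuses the values from the printed output above; `is_peak` and `positions` are illustrative names.

```python
import numpy as np
import pandas as pd

# Values copied from the printed output above
s = pd.Series([20, 82, 54, 25, 39, 33, 27, 17, 38, 70,
               9, 68, 33, 20, 78, 95, 72, 55, 71, 28])

# Boolean mask: strictly greater than both the previous and the next value.
# Comparisons against the NaN produced by shift() are False, so the two
# end points are never reported as peaks.
is_peak = (s.shift(1) < s) & (s.shift(-1) < s)

# Integer positions of the True entries
positions = np.flatnonzero(is_peak.to_numpy())
print(positions.tolist())
```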


33. Replace Spaces

Write a Pandas program to replace the white spaces in a given string with the least frequent character.



In [107]:
# Example string with spaces
text = "pandas is awesome"

# Convert string to Series of characters
s = pd.Series(list(text))

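# Find the least frequent character (excluding spaces).
# If several characters tie, idxmin() returns whichever comes first
# in the value_counts() result.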
least_freq_char = s[s != ' '].value_counts().idxmin()

# Replace spaces with the least frequent character
result = text.replace(" ", least_freq_char)

print("Original string:", text)
print("Least frequent character:", least_freq_char)
print("After replacement:", result)

Original string: pandas is awesome
Least frequent character: p
After replacement: pandaspispawesome


34. Autocorrelation

Write a Pandas program to compute the autocorrelations of a given numeric series.

From Wikipedia:
Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations as a function of the time lag between them.



In [112]:
s = pd.Series(np.random.randn(100))

# Compute autocorrelations for lags 1 through 10
print("Autocorrelations for lags 1 to 10:")
for lag in range(1, 11):
    print(f"Lag {lag}: {s.autocorr(lag):.4f}")

Autocorrelations for lags 1 to 10:
Lag 1: 0.0545
Lag 2: 0.0758
Lag 3: -0.0668
Lag 4: 0.0255
Lag 5: -0.0275
Lag 6: -0.0120
Lag 7: -0.0051
Lag 8: -0.0669
Lag 9: 0.0070
Lag 10: -0.1251
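
To connect the definition quoted above with the code, the lag-k autocorrelation can be reproduced as the Pearson correlation between the series and a copy of itself shifted by k positions. The sketch below assumes a fixed random seed for reproducibility, which the original cell does not use; `manual` and `builtin` are illustrative names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption for reproducibility
s = pd.Series(rng.standard_normal(100))

lag = 3

# Correlation between the series and a copy delayed by `lag`;
# pairs that contain the NaN values introduced by shift() are ignored
manual = s.corr(s.shift(lag))

# Built-in equivalent used in the solution above
builtin = s.autocorr(lag)

print(np.isclose(manual, builtin))
```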


35. Sundays TimeSeries

Write a Pandas program to create a TimeSeries to display all the Sundays of a given year.

In [68]:
# Set the year you want
year = 2024

# Generate all Sundays of the given year
sundays = pd.date_range(start=f'{year}-01-01', end=f'{year}-12-31', freq='W-SUN')
print(f"All Sundays in {year}:")
print(sundays)

All Sundays in 2024:
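DatetimeIndex(['2024-01-07', '2024-01-14', '2024-01-21', '2024-01-28',
               '2024-02-04', '2024-02-11', '2024-02-18', '2024-02-25',
               '2024-03-03', '2024-03-10', '2024-03-17', '2024-03-24',
               '2024-03-31', '2024-04-07', '2024-04-14', '2024-04-21',
               '2024-04-28', '2024-05-05', '2024-05-12', '2024-05-19',
               '2024-05-26', '2024-06-02', '2024-06-09', '2024-06-16',
               '2024-06-23', '2024-06-30', '2024-07-07', '2024-07-14',
               '2024-07-21', '2024-07-28', '2024-08-04', '2024-08-11',
               '2024-08-18', '2024-08-25', '2024-09-01', '2024-09-08',
               '2024-09-15', '2024-09-22', '2024-09-29', '2024-10-06',
               '2024-10-13', '2024-10-20', '2024-10-27', '2024-11-03',
               '2024-11-10', '2024-11-17', '2024-11-24', '2024-12-01',
               '2024-12-08', '2024-12-15', '2024-12-22', '2024-12-29'],
              dtype='datetime64[ns]', freq='W-SUN')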
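
A quick way to confirm that `freq='W-SUN'` yields only Sundays is to inspect `dayofweek`, where Monday is 0 and Sunday is 6. The sketch below repeats the `date_range` call from the cell above and also wraps the result in a Series, in case a Series rather than a `DatetimeIndex` is wanted; `sundays_series` is an illustrative name.

```python
import pandas as pd

year = 2024
sundays = pd.date_range(start=f'{year}-01-01', end=f'{year}-12-31', freq='W-SUN')

# dayofweek uses Monday=0 ... Sunday=6
print((sundays.dayofweek == 6).all())

# Wrap the dates in a Series if a Series object is preferred over an index
sundays_series = pd.Series(sundays)
print(len(sundays_series), sundays_series.iloc[0].date(), sundays_series.iloc[-1].date())
```

For 2024 this gives 52 dates, running from 2024-01-07 to 2024-12-29.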


36. Series to DataFrame

Write a Pandas program to convert a given series into a dataframe with its index as another column on the dataframe.

In [62]:
s = pd.Series(range(10, 20), index=list('abcdefghij'))
print("Original Series:")
print(s)

# Convert series to DataFrame with index as a column
df = s.reset_index()
df.columns = ['index', 'value']
print("\nDataFrame with index as a column:")
print(df)

Original Series:
a    10
b    11
c    12
d    13
e    14
f    15
g    16
h    17
i    18
j    19
dtype: int64

DataFrame with index as a column:
  index  value
0     a     10
1     b     11
2     c     12
3     d     13
4     e     14
5     f     15
6     g     16
7     h     17
8     i     18
9     j     19
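
The two-step approach above can also be collapsed into a single call. As a sketch, `Series.reset_index` accepts a `name` argument that labels the value column, while an unnamed index becomes a column called `index`; `df_one_step` is an illustrative name.

```python
import pandas as pd

s = pd.Series(range(10, 20), index=list('abcdefghij'))

# name= labels the value column; the old (unnamed) index becomes 'index'
df_one_step = s.reset_index(name='value')

print(df_one_step.columns.tolist())
print(df_one_step.head(3))
```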


37. Stack Series

Write a Pandas program to stack two given series vertically and horizontally.

In [63]:
s1 = pd.Series(range(10))
print("Series 1:")
print(s1)

s2 = pd.Series(list("abcdefghij"))
print("\nSeries 2:")
print(s2)

# Stack vertically (one below the other)
vertical_stack = pd.concat([s1, s2], axis=0)
print("\nStacked Vertically:")
print(vertical_stack)

# Stack horizontally (side by side as columns)
horizontal_stack = pd.concat([s1, s2], axis=1)
print("\nStacked Horizontally:")
print(horizontal_stack)

Series 1:
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int64

Series 2:
0    a
1    b
2    c
3    d
4    e
5    f
6    g
7    h
8    i
9    j
dtype: object

Stacked Vertically:
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
0    a
1    b
2    c
3    d
4    e
5    f
6    g
7    h
8    i
9    j
dtype: object

Stacked Horizontally:
   0  1
0  0  a
1  1  b
2  2  c
3  3  d
4  4  e
5  5  f
6  6  g
7  7  h
8  8  i
9  9  j


38. Series Equality Check

Write a Pandas program to check the equality of two given series.

In [64]:
s1 = pd.Series(np.random.randint(10, size=10))
print("Series 1:")
print(s1)

s2 = pd.Series(np.random.randint(10, size=10))
print("\nSeries 2:")
print(s2)

print("\nAre the two series equal?")
print(s1.equals(s2))

Series 1:
0    7
1    1
2    8
3    2
4    8
5    0
6    4
7    3
8    8
9    6
dtype: int32

Series 2:
0    7
1    9
2    6
3    8
4    4
5    6
6    4
7    7
8    1
9    6
dtype: int32

Are the two series equal?
False


39. Index of Extremes

Write a Pandas program to find the index of the first occurrence of the smallest and largest value of a given series.

In [65]:
s = pd.Series(np.random.randint(10, size=10))
print("Original Series:")
print(s)

print("\nIndex of the first occurrence of the smallest value:")
print(f"Value: {s.min()}, Index: {s.idxmin()}")

print("\nIndex of the first occurrence of the largest value:")
print(f"Value: {s.max()}, Index: {s.idxmax()}")

Original Series:
0    1
1    3
2    5
3    8
4    7
5    6
6    4
7    7
8    0
9    0
dtype: int32

Index of the first occurrence of the smallest value:
Value: 0, Index: 8

Index of the first occurrence of the largest value:
Value: 8, Index: 3
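
With the default integer index, labels and positions coincide, so the difference between the two is easy to miss. The sketch below reuses the values from the printed output above but attaches letter labels (an assumption made purely for illustration) to show that `idxmin`/`idxmax` return index labels, while `argmin`/`argmax` return integer positions.

```python
import pandas as pd

# Same values as in the printed output above, with hypothetical letter labels
s = pd.Series([1, 3, 5, 8, 7, 6, 4, 7, 0, 0], index=list('abcdefghij'))

# Label and integer position of the first occurrence of the minimum
print(s.idxmin(), s.argmin())

# Label and integer position of the first occurrence of the maximum
print(s.idxmax(), s.argmax())
```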


40. DataFrame-Series Inequality

Write a Pandas program to check inequality over the index axis of a given dataframe and a given series.

In [66]:
# Create the DataFrame
df = pd.DataFrame({
    'W': [68.0, 75.0, 86.0, 80.0, np.nan],
    'X': [78.0, 75.0, np.nan, 80.0, 86.0],
    'Y': [84, 94, 89, 86, 86],
    'Z': [86, 97, 96, 72, 83]
})

# Create a Series with the same values as column 'W'
s = pd.Series([68.0, 75.0, 86.0, 80.0, np.nan])

# Output
print("Original DataFrame:")
print(df)
print("\nOriginal Series:")
print(s)
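# axis=0 aligns the Series index with the DataFrame's row index,
# so each column is compared with the Series element by element
print("\nDataFrame not equal to Series along columns:")
print(df.ne(s, axis=0))
# axis=1 aligns the Series index with the column labels instead
print("\nDataFrame not equal to Series along rows:")
print(df.ne(s, axis=1))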

Original DataFrame:
      W     X   Y   Z
0  68.0  78.0  84  86
1  75.0  75.0  94  97
2  86.0   NaN  89  96
3  80.0  80.0  86  72
4   NaN  86.0  86  83

Original Series:
0    68.0
1    75.0
2    86.0
3    80.0
4     NaN
dtype: float64

DataFrame not equal to Series along columns:
       W      X     Y     Z
0  False   True  True  True
1  False  False  True  True
2  False   True  True  True
3  False  False  True  True
4   True   True  True  True

DataFrame not equal to Series along rows:
      W     X     Y     Z     0     1     2     3     4
0  True  True  True  True  True  True  True  True  True
1  True  True  True  True  True  True  True  True  True
2  True  True  True  True  True  True  True  True  True
3  True  True  True  True  True  True  True  True  True
4  True  True  True  True  True  True  True  True  True
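
The two results above differ because of label alignment. The sketch below uses a smaller, hypothetical frame (three rows, two columns) to make this easier to see: with `axis=0` the Series is matched against the row labels, and with `axis=1` it is matched against the column labels, which share nothing with the Series index. Note also that `NaN` is never equal to `NaN`, so a missing value on both sides still reports `True`. The names `by_rows` and `by_cols` are illustrative.

```python
import numpy as np
import pandas as pd

# Smaller hypothetical data, chosen only to keep the output short
df = pd.DataFrame({'W': [68.0, 75.0, np.nan], 'X': [78.0, 75.0, 86.0]})
s = pd.Series([68.0, 75.0, np.nan])

# axis=0: s is aligned with the row labels 0, 1, 2 and compared with every column
by_rows = df.ne(s, axis=0)

# axis=1: s is aligned with the column labels; 'W', 'X' and 0, 1, 2 never match,
# so the result carries the union of labels and every comparison is True
by_cols = df.ne(s, axis=1)

print(by_rows)
print(list(by_cols.columns))
```

For checking inequality over the index axis, as the exercise asks, `axis=0` is the call that produces a meaningful comparison here.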
