## Data frame inspection

In [9]:
import pandas as pd 

In [13]:
a=pd.Series() ## create an empty Series
## a Series is a one-dimensional labeled array that can hold any data type (integers, strings, floats, Python objects, etc.)

In [15]:
a

Series([], dtype: object)

In [25]:
a=pd.Series(18)

In [27]:
a

0    18
dtype: int64

In [35]:
t=(19,34,57)
a=pd.Series(t)

In [37]:
a

0    19
1    34
2    57
dtype: int64

In [41]:
q=[132,35,68,24,68]
a=pd.Series(q)

In [43]:
a

0    132
1     35
2     68
3     24
4     68
dtype: int64

In [56]:
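q.append(t) ## append adds the whole tuple as one element; it shows up twice below because this cell was run twice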

In [58]:
q

[132, 35, 68, 24, 68, (19, 34, 57), (19, 34, 57)]
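
`append` adds its argument as a single element, which is why the tuple is nested inside the list. A minimal sketch of the difference from `extend`, working on fresh copies of the same values (the names `appended` and `extended` are illustrative, not from the notebook):

```python
import pandas as pd

t = (19, 34, 57)
q = [132, 35, 68, 24, 68]

# append adds the whole tuple as one element
appended = list(q)
appended.append(t)

# extend adds each element of the tuple individually
extended = list(q)
extended.extend(t)

print(appended)
print(extended)

# A list of plain integers becomes an integer Series;
# a list that mixes integers and a tuple falls back to dtype object.
print(pd.Series(extended).dtype)
print(pd.Series(appended).dtype)
```

If the goal is a flat Series of numbers, `extend` (or `q + list(t)`) is usually the one to reach for.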

In [63]:
import numpy as np
arr=np.array([1,2,3,4,5])
d1=pd.Series(arr)

In [65]:
d1

0    1
1    2
2    3
3    4
4    5
dtype: int32

In [83]:
## create a 2-D NumPy array (not a Series) with 2 rows and 5 columns; .shape confirms the dimensions
arr2=np.array([[1,3,4,56,7],[34,56,38,9,5]])
arr2.shape

(2, 5)

In [91]:
e=pd.Series(arr2)
## this raises an error because a Series must be one-dimensional and arr2 is 2-D

ValueError: Data must be 1-dimensional, got ndarray of shape (2, 5) instead
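
Two common ways around this error, sketched under the assumption that you either want all ten numbers in one Series or want to keep the row/column layout (the names `flat` and `frame` are illustrative):

```python
import numpy as np
import pandas as pd

arr2 = np.array([[1, 3, 4, 56, 7], [34, 56, 38, 9, 5]])

# Option 1: flatten to one dimension, then build a Series
flat = pd.Series(arr2.ravel())

# Option 2: keep both dimensions by building a DataFrame instead
frame = pd.DataFrame(arr2)

print(flat.shape)
print(frame.shape)
```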

In [95]:
d={'a':1,'b':2,'c':3,'d':4}
d.values() ## returns the dictionary's values

dict_values([1, 2, 3, 4])

In [109]:
d.keys() ## returns the dictionary's keys (these become the index labels when the dict is passed to pd.Series)

dict_keys(['a', 'b', 'c', 'd'])

In [111]:
d.items() ## returns all (key, value) pairs

dict_items([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

In [116]:
n=pd.Series(d)

In [118]:
n.iloc[0] ## positional access; plain n[0] on a label-indexed Series emits a FutureWarning in recent pandas

1

In [120]:
n['b']

2
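
When a Series has string labels, it helps to say explicitly whether you mean a label or a position. A small sketch using the same dictionary (the variable names on the left are illustrative):

```python
import pandas as pd

n = pd.Series({'a': 1, 'b': 2, 'c': 3, 'd': 4})

first_by_position = n.iloc[0]      # element at position 0
b_by_label = n.loc['b']            # element with label 'b'
slice_by_label = n.loc['b':'c']    # label slices include both endpoints
slice_by_position = n.iloc[1:3]    # position slices exclude the stop

print(first_by_position, b_by_label)
print(slice_by_label)
print(slice_by_position)
```

Both slices select the same two elements here, but only because `'c'` sits at position 2; the inclusive/exclusive rule is the part worth remembering.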

In [None]:
## slicing operations 

In [133]:
d=pd.Series([1,2,3,5,47,98,7,8,6,32,78,2,8,289,258,78])
d[:8:2] ## positions 0 up to (but not including) 8, taking every 2nd value

0     1
2     3
4    47
6     7
dtype: int64

In [135]:
d[[4,5,3,6]] ## select several index labels at once, in the order given

4    47
5    98
3     5
6     7
dtype: int64

In [139]:
d[4]=34  ## replace the value at index label 4
d

0       1
1       2
2       3
3       5
4      34
5      98
6       7
7       8
8       6
9      32
10     78
11      2
12      8
13    289
14    258
15     78
dtype: int64

In [143]:
d[6:10]=[100,200,300,400] ## assign new values to positions 6 through 9
d

0       1
1       2
2       3
3       5
4      34
5      98
6     100
7     200
8     300
9     400
10     78
11      2
12      8
13    289
14    258
15     78
dtype: int64

In [145]:
## here we assign an index label (not a column name) to each element in arr3
arr3=np.array([1,2,4,6])
s=pd.Series(arr3,index=['one','Two','Three','Four'])   
s

one      1
Two      2
Three    4
Four     6
dtype: int32

In [147]:
arr4=np.array([1,2,4,6])
s1=pd.Series(arr4)   
s1

0    1
1    2
2    4
3    6
dtype: int32

In [153]:
import pandas as pd
## empty dataframe
d=pd.DataFrame()
print(d)

Empty DataFrame
Columns: []
Index: []


In [157]:
data=[['abc',1],['def',2],['ghi',3],['','hij'],['klm','a']]

In [161]:
d1=pd.DataFrame(data)
d1

     0    1
0  abc    1
1  def    2
2  ghi    3
3       hij
4  klm    a


In [165]:
d=pd.DataFrame(data,columns=['A','B'],index=['A','B','C','D','E'])  ## set the column names and custom row labels
d

     A    B
A  abc    1
B  def    2
C  ghi    3
D       hij
E  klm    a


To get a strong grasp of **pandas** for data science interviews, focus on mastering these key functions:

### DataFrame Creation and Inspection
1. **`pd.DataFrame()`** - Create DataFrames.
2. **`df.head()`** - View the first few rows.
3. **`df.tail()`** - View the last few rows.
4. **`df.info()`** - Overview of DataFrame structure.
5. **`df.describe()`** - Summary statistics.
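
A minimal sketch of the inspection calls listed above, run on a small hypothetical DataFrame (the column names and values are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "name": ["abc", "def", "ghi", "klm"],
    "score": [1, 2, 3, 4],
})

print(df.head(2))        # first two rows
print(df.tail(2))        # last two rows
df.info()                # prints column names, non-null counts, and dtypes
summary = df.describe()  # by default, statistics for numeric columns only
print(summary)
```

`describe()` skips the `name` column here because it is not numeric; passing `include="all"` would bring it back.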

### Selection and Indexing
6. **`df.iloc[]`** - Select by integer position (row/column).
7. **`df.loc[]`** - Select by labels.
8. **`df.at[]` / `df.iat[]`** - Fast access for single elements.
9. **`df.set_index()`** - Set a column as an index.
10. **`df.reset_index()`** - Reset index to default.
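
The selection methods above differ mainly in whether they take labels or positions. A short sketch on a hypothetical DataFrame with string row labels (all names and values are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["abc", "def", "ghi"], "score": [10, 20, 30]},
    index=["x", "y", "z"],
)

by_position = df.iloc[0, 1]        # row position 0, column position 1
by_label = df.loc["y", "score"]    # row label 'y', column label 'score'
fast_label = df.at["z", "score"]   # single value by label
fast_position = df.iat[2, 0]       # single value by position

indexed = df.set_index("name")     # 'name' column becomes the index
restored = indexed.reset_index()   # index moves back into a column, rows renumbered from 0

print(by_position, by_label, fast_label, fast_position)
print(indexed)
print(restored)
```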

### Data Cleaning
11. **`df.isnull()`** / **`df.notnull()`** - Check for missing values.
12. **`df.fillna()`** - Fill missing values.
13. **`df.dropna()`** - Drop rows/columns with missing values.
14. **`df.drop_duplicates()`** - Remove duplicate rows.
15. **`df.replace()`** - Replace values.
16. **`df.astype()`** - Change data types.
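
One way the cleaning functions above can be chained, sketched on hypothetical data that has a duplicate row, a missing city, and a placeholder string `"n/a"` (all values are invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", None],
    "temp": ["31", "31", "n/a", "28"],
})

missing_per_column = df.isnull().sum()          # count of missing values per column
deduped = df.drop_duplicates()                  # drops the repeated Pune/31 row
cleaned = deduped.replace("n/a", np.nan)        # treat the placeholder as missing
cleaned = cleaned.dropna(subset=["temp"])       # drop rows without a temperature
cleaned = cleaned.fillna({"city": "unknown"})   # fill the missing city
cleaned = cleaned.astype({"temp": "int64"})     # convert temperature strings to integers

print(missing_per_column)
print(cleaned)
```

The order matters in this sketch: the placeholder is converted to a real missing value before `dropna`, and `astype` runs last so it only sees strings that parse as integers.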

### Filtering and Sorting
17. **`df.query()`** - Query the DataFrame using a string expression.
18. **`df.filter()`** - Subset rows or columns by label (it does not filter on values).
19. **`df.sort_values()`** - Sort by column values.
20. **`df.sort_index()`** - Sort by index.
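
A brief sketch of the four functions above on a hypothetical DataFrame with a shuffled index (column names and values are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["abc", "def", "ghi", "klm"],
        "score": [30, 10, 40, 20],
        "score_max": [50, 50, 50, 50],
    },
    index=[3, 1, 2, 0],
)

high = df.query("score > 15")                        # rows where the condition holds
score_cols = df.filter(like="score")                 # columns whose label contains 'score'
by_score = df.sort_values("score", ascending=False)  # highest score first
by_index = df.sort_index()                           # rows ordered by index label

print(high)
print(score_cols)
print(by_score)
print(by_index)
```

Note that `filter` works on labels only; selecting rows by value is what `query` or boolean indexing is for.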

### Grouping and Aggregation
21. **`df.groupby()`** - Group data and apply aggregation functions.
22. **`df.agg()`** - Perform multiple aggregation operations.
23. **`df.transform()`** - Apply a function and return a result with the same shape as the input.
24. **`df.pivot_table()`** - Create pivot tables.
25. **`pd.crosstab()`** - Cross-tabulation of factors (a top-level function, not a DataFrame method).
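
A small sketch of grouping and aggregation on hypothetical data (team, role, and hours are invented for illustration). Cross-tabulation is called here as the top-level function `pd.crosstab`:

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "role": ["dev", "ops", "dev", "dev"],
    "hours": [5, 7, 2, 4],
})

totals = df.groupby("team")["hours"].sum()                # one total per team
stats = df.groupby("team")["hours"].agg(["min", "max"])   # several aggregations at once
# transform returns one value per original row, aligned with df
df["team_mean"] = df.groupby("team")["hours"].transform("mean")
pivot = df.pivot_table(index="team", columns="role", values="hours", aggfunc="sum")
counts = pd.crosstab(df["team"], df["role"])              # how often each team/role pair occurs

print(totals)
print(stats)
print(df)
print(pivot)
print(counts)
```

The contrast between `agg` and `transform` is the usual interview point: `agg` collapses each group to one row, while `transform` keeps the original number of rows.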

### Merging and Joining
26. **`pd.concat()`** - Concatenate DataFrames.
27. **`pd.merge()`** - Merge DataFrames based on keys.
28. **`df.join()`** - Join DataFrames on index or key.
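
The three combining functions above can be compared on two small hypothetical tables that share a `key` column (names and values are illustrative):

```python
import pandas as pd

left = pd.DataFrame({"key": [1, 2, 3], "name": ["abc", "def", "ghi"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [20, 30, 40]})

stacked = pd.concat([left, left], ignore_index=True)    # stack rows, renumber the index
inner = pd.merge(left, right, on="key")                 # only keys present in both
outer = pd.merge(left, right, on="key", how="outer")    # all keys from either side
# join aligns on the index, so both sides get 'key' as their index first
joined = left.set_index("key").join(right.set_index("key"))

print(stacked)
print(inner)
print(outer)
print(joined)
```

`join` defaults to a left join, so key 1 is kept with a missing `score`, while `merge` defaults to an inner join.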

### Handling Time Series Data
29. **`pd.to_datetime()`** - Convert to datetime.
30. **`df.resample()`** - Resample time series data.
31. **`df.shift()`** - Shift data in time series.
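
A minimal time series sketch using four hypothetical daily observations (dates and values are invented for illustration):

```python
import pandas as pd

dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
ts = pd.DataFrame({"value": [1, 2, 3, 4]}, index=dates)

two_day = ts.resample("2D").sum()   # sum the values in consecutive 2-day bins
lagged = ts.shift(1)                # move every value one row later; first row becomes NaN
ts["change"] = ts["value"] - lagged["value"]   # day-over-day difference

print(two_day)
print(lagged)
print(ts)
```

`resample` needs a datetime-like index, which is why the dates are converted with `pd.to_datetime` before building the DataFrame.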

### Input/Output
32. **`pd.read_csv()` / `df.to_csv()`** - Read from and write to CSV.
33. **`pd.read_excel()` / `df.to_excel()`** - Read from and write to Excel.
34. **`pd.read_sql()` / `df.to_sql()`** - Read from and write to SQL databases.
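
A sketch of the CSV and SQL round trips that avoids touching the disk: it assumes an in-memory text buffer in place of a CSV file and an in-memory SQLite database in place of a real server (the table name `scores` is hypothetical). Excel is left out because it needs an extra engine package such as `openpyxl`.

```python
import io
import sqlite3

import pandas as pd

df = pd.DataFrame({"name": ["abc", "def"], "score": [1, 2]})

# CSV round trip through an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)   # index=False avoids writing the row labels as a column
buffer.seek(0)
from_csv = pd.read_csv(buffer)

# SQL round trip through an in-memory SQLite database
conn = sqlite3.connect(":memory:")
df.to_sql("scores", conn, index=False)
from_sql = pd.read_sql("SELECT name, score FROM scores", conn)
conn.close()

print(from_csv)
print(from_sql)
```

With real files, the buffer is replaced by a path such as `pd.read_csv("data.csv")`.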

### Visualization
35. **`df.plot()`** - Basic plotting using matplotlib.

These functions should give you a solid foundation in pandas, preparing you for most data science tasks and interviews.