<img src="../../../images/banners/pandas-cropped.jpeg" width="600"/>

<a class="anchor" id="essential_basic_functionality"></a>
# <img src="../../../images/logos/pandas.png" width="23"/>  Essential Basic Functionality (Problems)

**Question:**  
How can you create a range of dates in pandas?

**Answer:**  
In Pandas, you can create a range of dates using the `date_range()` function. Here's an example:
```
import pandas as pd

# Daily dates from 2023-01-01 through 2023-01-31 (both ends included)
date_range = pd.date_range(start='2023-01-01', end='2023-01-31')
print(date_range)
```
In this example, we used the `date_range()` function to create a range of dates from January 1, 2023 to January 31, 2023. We specified the start and end dates using the `start` and `end` parameters, respectively.

The `date_range()` function returns a `DatetimeIndex`, the index type pandas uses for date and time data. By default, `freq` is `'D'`, so the range has daily frequency. You can pass other frequencies through the `freq` parameter, such as `'W'` for weekly or `'M'` for month-end (spelled `'ME'` in pandas 2.2 and later, where `'M'` is deprecated).
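
As a small sketch of the `freq` parameter, together with `periods` as an alternative to an end date (the variable names `weekly` and `first_week` are illustrative only):

```python
import pandas as pd

# Weekly frequency: 'W' anchors on Sundays by default,
# and 2023-01-01 happens to be a Sunday.
weekly = pd.date_range(start='2023-01-01', end='2023-01-31', freq='W')

# Instead of an end date, ask for a fixed number of daily periods.
first_week = pd.date_range(start='2023-01-01', periods=7, freq='D')

print(weekly)
print(first_week)
```

Here `weekly` holds the five Sundays in January 2023, and `first_week` holds January 1 through January 7.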

---

**Question:**  
Which pandas methods show summary information about a DataFrame?

**Answer:**  
Pandas provides several methods and attributes to display summary information about a DataFrame. Here are some of the most commonly used:
- `.info()`: Prints a summary of the DataFrame including the number of non-null values, data type of each column, and memory usage.
- `.describe()`: Generates descriptive statistics (count, mean, standard deviation, minimum, maximum, and quartiles); by default only numeric columns are included.
- `.head()`: Returns the first n rows of the DataFrame (by default n=5).
- `.tail()`: Returns the last n rows of the DataFrame (by default n=5).
- `.shape`: Returns a tuple `(rows, columns)` representing the dimensions of the DataFrame.
- `.dtypes`: Returns a Series with the data type of each column.
- `.columns`: Returns an `Index` of the column labels (use `list(df.columns)` to get a plain list).
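
The items above can be tried on a tiny DataFrame. This is only a sketch; the column names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical data with one missing score
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1.0, 2.5, None]})

df.info()              # prints dtypes, non-null counts, and memory usage
stats = df.describe()  # statistics for the numeric column only
print(stats)
print(df.head(2))      # first two rows
print(df.tail(1))      # last row
print(df.shape)        # (3, 2)
print(df.dtypes)
print(list(df.columns))
```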

---

**Question:**  
What are flexible binary operations in pandas?

**Answer:**  
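In pandas, flexible binary operations are the method forms of the binary operators, such as `add()`, `sub()`, `mul()`, and `div()`. They work between two pandas objects (Series or DataFrames), even with different indexes, or between a pandas object and a scalar value. Like the arithmetic operators, they align the operands on their labels before computing, and labels that appear in only one operand yield `NaN` in the result. Unlike the operators, they accept extra arguments such as `axis`, `level`, and `fill_value` to control how the operands are matched and broadcast.

The flexible binary operations in pandas include:
- `Arithmetic operations`: `add()` (+), `sub()` (-), `mul()` (*), `div()` (/), `floordiv()` (//), `mod()` (%), `pow()` (**), plus reversed variants such as `radd()` and `rsub()`
- `Comparison operations`: `eq()` (==), `ne()` (!=), `lt()` (<), `gt()` (>), `le()` (<=), `ge()` (>=)

The logical operators & (and), | (or), and ^ (xor) are binary operators too, while ~ (not) is unary. Reductions such as `sum()`, `mean()`, `min()`, `max()`, `all()`, and `any()` are not binary operations: they collapse a single object along an axis.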
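
A brief sketch of the method forms, assuming a small made-up DataFrame (the names `row`, `col`, `by_row`, `by_col`, and `is_big` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
row = df.iloc[0]   # Series indexed by column labels: a, b
col = df["a"]      # Series indexed by row labels: 0, 1, 2

# Match the Series against the column labels (what `df - row` does)
by_row = df.sub(row, axis="columns")

# Match the Series against the row index and broadcast across columns
by_col = df.sub(col, axis="index")

# Method form of the comparison `df > 5`
is_big = df.gt(5)

print(by_row)
print(by_col)
print(is_big)
```

The `axis` argument is what the plain operators cannot express: it selects which set of labels the Series is matched against.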

---

**Question:**  
What is the meaning of **data alignment** in pandas?

**Answer:**  
Data alignment in pandas refers to the process of matching the index labels of two or more pandas objects (Series or DataFrames) before performing an operation between them. When two objects have the same index, the operation is performed on the corresponding elements. When the objects have different indexes, pandas aligns them by label: the result index is the union of the input labels, the operation is performed where the labels match, and labels found in only one object get `NaN`.

Data alignment is an important feature of Pandas because it allows you to perform operations between data that may have different shapes or indexes, without having to worry about manually aligning the data. This makes it easier to work with data that may be incomplete or have missing information, and helps to ensure that the results of operations are accurate and meaningful.
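
A minimal sketch of alignment with two Series whose indexes only partly overlap (the labels and values are invented for illustration):

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Values are paired by label, not by position.
total = s1 + s2
print(total)
```

Only `b` and `c` exist in both Series, so those two positions hold sums, while `a` and `d` come back as `NaN`.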


---

**Question:**  
A student wants to add the values of column **'one'** to every column of a DataFrame and tries the following code:
```
import pandas as pd

df = pd.DataFrame({"one": [10, 20, 30], "two": [100, 200, 300]})
df = df + df['one']
print(df)
```
Unfortunately the output is as follows:
```
   one  two   0   1   2
0  NaN  NaN NaN NaN NaN
1  NaN  NaN NaN NaN NaN
2  NaN  NaN NaN NaN NaN
```
Can you help the student solve the problem?

**Answer:**  
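By default, the `+` operator between a DataFrame and a Series matches the Series index against the DataFrame's column labels (`axis='columns'`). Here the Series index is `0, 1, 2`, which matches neither `one` nor `two`, so every cell is `NaN`. To match on the row index instead, use the `.add()` method and set `axis` to `'rows'` (equivalently `'index'` or `0`):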
```
df = df.add(df['one'], axis='rows')
print(df)
   one  two
0   20  110
1   40  220
2   60  330
```


---

**Question:**  
Can you explain what happens when you use the `fill_value` option in pandas arithmetic methods?

**Answer:**  
This option is available in methods such as `add()` and `sub()`. Before computing, pandas substitutes this value for existing missing (`NaN`) values and for any new element needed to align the two DataFrames. If the data is missing at the same location in both DataFrames, the result is still missing.
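
The three cases can be seen side by side in a short sketch (the frames `df1` and `df2` are made up for illustration):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1.0, np.nan, np.nan]}, index=["a", "b", "c"])
df2 = pd.DataFrame({"x": [10.0, 20.0, np.nan, 40.0]}, index=["a", "b", "c", "d"])

plain = df1 + df2                    # NaN wherever either side is missing
filled = df1.add(df2, fill_value=0)  # a missing side is treated as 0

print(plain)
print(filled)
```

With `fill_value=0`, row `b` (missing only in `df1`) becomes 20 and row `d` (absent from `df1`) becomes 40, while row `c` stays `NaN` because it is missing on both sides.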


---

**Question:**  
What is the `.fillna()` method used for?

**Answer:**  
The `.fillna()` method replaces missing values (`NaN`, `None`, or `NaT`) with a specified value.

It returns a new DataFrame unless the `inplace` parameter is set to `True`, in which case it modifies the original DataFrame instead.
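
A short sketch of two common ways to call it, with a single value and with one value per column (the data and the names `filled` and `per_col` are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

# One replacement value for every missing cell
filled = df.fillna(0)

# A different replacement per column, passed as a dict
per_col = df.fillna({"a": df["a"].mean(), "b": -1})

print(filled)
print(per_col)
print(df)  # the original still contains its NaN values
```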


---

**Question:**  
What is the difference between the `.all()` and `.any()` methods in pandas?

**Answer:**  
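- `.all()` performs a logical **and** along an axis of a DataFrame (down each column by default) and returns one Boolean per column or row: `True` only if every value is truthy.
- `.any()` performs a logical **or** along an axis of a DataFrame (down each column by default) and returns one Boolean per column or row: `True` if at least one value is truthy.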
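
A small sketch of both methods on a made-up Boolean DataFrame, by column and by row:

```python
import pandas as pd

df = pd.DataFrame({
    "p": [True, True, True],
    "q": [True, False, False],
    "r": [False, False, False],
})

all_cols = df.all()        # one Boolean per column
any_cols = df.any()        # one Boolean per column
any_rows = df.any(axis=1)  # one Boolean per row
all_rows = df.all(axis=1)  # one Boolean per row

print(all_cols)
print(any_cols)
print(any_rows)
print(all_rows)
```

Column `q` shows the difference: it contains both `True` and `False`, so `.any()` reports `True` for it while `.all()` reports `False`.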


---

**Question:**  
Can you explain why the result of the following code is `False`? How can you check the equality of two pandas DataFrames which have `NaN` values?
```
import numpy as np
import pandas as pd

df1 = pd.DataFrame([10, 20, np.nan], index=list('abc'))
df2 = df1
print((df2 == df1).all())
0    False
dtype: bool
```
**Answer:**  
This is because `NaN` values never compare as equal, not even to themselves, so the comparison in the last row is `False` and `.all()` returns `False`. To test equality, use the `.equals()` method, which treats `NaN`s in corresponding locations as equal.
```
df2.equals(df1)
True
```

---

**Question:**  
How does `df1.combine_first(df2)` work?

**Answer:**  
It combines two DataFrames by filling the null values in `df1` with non-null values from `df2` at the same row and column labels. The row and column indexes of the resulting DataFrame are the union of those of the two inputs.
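
A minimal sketch with two made-up DataFrames that overlap only partly in rows and columns:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]}, index=[0, 1])
df2 = pd.DataFrame({"a": [10.0, 20.0, 30.0], "c": [7.0, 8.0, 9.0]}, index=[0, 1, 2])

# Values from df1 win; df2 only fills the gaps and adds new labels.
combined = df1.combine_first(df2)
print(combined)
```

In the result, column `a` keeps 1.0 from `df1` and takes 20.0 and 30.0 from `df2`, column `c` comes entirely from `df2`, and column `b` stays `NaN` wherever `df2` has nothing to offer.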

---