In [29]:
exec(open("../code-activity-templates/utils.py").read())

In [30]:
import numpy as np
import pandas as pd

### 1. `assert_pd_series_equal()` function

If we ask the student to create/compute a pandas series and want to test whether the student's series is correct, we can use the `assert_pd_series_equal()` function to compare the student's series with the correct series.

In [31]:
# Series 1
series1 = pd.Series([1, 2, 3, 4, 5])

# Series 2
series2 = pd.Series([6, 7, 8, 9, 10])

In [32]:
# assertions
assert_pd_series_equal(series1, pd.Series([1, 2, 3, 4, 5]))

expected_series2 = pd.Series([6, 7, 8, 9, 10])
assert_pd_series_equal(series2, expected_series2)

In [33]:
# Add the two series together
sum_series = series1 + series2

# assertions
expected_sum_series = pd.Series([7, 9, 11, 13, 15])
assert_pd_series_equal(sum_series, expected_sum_series)
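
The implementation of `assert_pd_series_equal()` lives in `utils.py` and is not shown in this notebook. As a rough sketch of the kind of check involved, and assuming it behaves similarly to pandas' own `pd.testing.assert_series_equal` (an assumption, not something the notebook confirms), a matching and a mismatching series could be handled like this:

```python
import pandas as pd

expected = pd.Series([7, 9, 11, 13, 15])
student = pd.Series([7, 9, 11, 13, 15])
wrong = pd.Series([7, 9, 11, 13, 16])  # last value differs

# Equal series: the pandas helper returns None and raises nothing
pd.testing.assert_series_equal(student, expected)

# Different series: the pandas helper raises AssertionError
try:
    pd.testing.assert_series_equal(wrong, expected)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

print(mismatch_detected)
```

The `utils.py` function may word its error message differently or relax some checks; this only illustrates the pass/fail behaviour.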

### 2. `assert_pd_dataframe_equals()` function

If we ask the student to create/compute a pandas dataframe and want to test whether the student's dataframe is correct, we can use the `assert_pd_dataframe_equals()` function to compare the student's dataframe with the correct dataframe.

In [34]:
df1 = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [6, 7, 8, 9, 10],
    }
)

df2 = pd.DataFrame(
    {
        "A": [11, 12, 13, 14, 15],
        "B": [16, 17, 18, 19, 20],
    }
)

In [35]:
df1

|  | A | B |
|---|---|---|
| 0 | 1 | 6 |
| 1 | 2 | 7 |
| 2 | 3 | 8 |
| 3 | 4 | 9 |
| 4 | 5 | 10 |


In [36]:
df2

|  | A | B |
|---|---|---|
| 0 | 11 | 16 |
| 1 | 12 | 17 |
| 2 | 13 | 18 |
| 3 | 14 | 19 |
| 4 | 15 | 20 |


In [37]:
# assertions
assert_pd_dataframe_equals(df1, pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]}))

expected_df2 = pd.DataFrame({"A": [11, 12, 13, 14, 15], "B": [16, 17, 18, 19, 20]})
assert_pd_dataframe_equals(df2, expected_df2)
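
When writing expected dataframes, it helps to remember that dataframe comparisons can be sensitive to column dtypes, not just values. The sketch below uses pandas' own `pd.testing.assert_frame_equal` to show this; whether `assert_pd_dataframe_equals()` from `utils.py` is equally strict is an assumption that should be verified against its source.

```python
import pandas as pd

expected = pd.DataFrame({"A": [1, 2, 3], "B": [6, 7, 8]})
# Same values, but column "A" holds floats instead of integers
student = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [6, 7, 8]})

# By default the pandas helper also compares dtypes, so this fails
try:
    pd.testing.assert_frame_equal(student, expected)
    strict_check_passed = True
except AssertionError:
    strict_check_passed = False

# With check_dtype=False only the values (and labels) are compared
pd.testing.assert_frame_equal(student, expected, check_dtype=False)
relaxed_check_passed = True

print(strict_check_passed, relaxed_check_passed)
```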

### 3. `assert_pd_series_variable_equals_variable()` function

This function is similar to `assert_pd_series_equal()`, but it takes variable names instead of values and also checks that the expected variable exists and that its type is a pandas series, along with the value.

In [38]:
# series1
series1 = pd.Series([1, 2, 3, 4, 5])

# series2
series2 = pd.Series([6, 7, 8, 9, 10])

# Add the two series together
sum_series = series1 + series2

# assertions
expected_sum_series = pd.Series([7, 9, 11, 13, 15])
assert_pd_series_variable_equals_variable('sum_series', 'expected_sum_series')

If the expected variable is not a pandas series, the function raises an error. For example:

In [39]:
# Expected is not a series
not_pandas_series = [7, 9, 11, 13, 15]
expected_pandas_series = pd.Series(not_pandas_series)
assert_pd_series_variable_equals_variable('sum_series', 'not_pandas_series')

AssertionError: Your Series doesn't match what's expected: Series Expected type <class 'pandas.core.series.Series'>, found <class 'list'> instead

The call raises an `AssertionError` with the message `Your Series doesn't match what's expected: Series Expected type <class 'pandas.core.series.Series'>, found <class 'list'> instead`.
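
To make the name-based check more concrete, here is a minimal sketch of how such a function could work. The helper `check_series_variable` and its explicit `namespace` argument are hypothetical and are not part of `utils.py`; the real function presumably looks names up in the notebook's global namespace.

```python
import pandas as pd


def check_series_variable(student_name, expected_name, namespace):
    """Hypothetical helper: look up both names, check types, compare values."""
    for name in (student_name, expected_name):
        assert name in namespace, f"Variable {name!r} is not defined"
        value = namespace[name]
        assert isinstance(value, pd.Series), (
            f"Expected type {pd.Series}, found {type(value)} instead"
        )
    pd.testing.assert_series_equal(namespace[student_name], namespace[expected_name])


# A plain dict stands in for the notebook's globals()
ns = {
    "sum_series": pd.Series([7, 9, 11, 13, 15]),
    "expected_sum_series": pd.Series([7, 9, 11, 13, 15]),
    "not_pandas_series": [7, 9, 11, 13, 15],
}

# Both names refer to equal Series: no error
check_series_variable("sum_series", "expected_sum_series", ns)

# The second name refers to a list: the type check fails
try:
    check_series_variable("sum_series", "not_pandas_series", ns)
    type_error_raised = False
except AssertionError as exc:
    type_error_raised = True
    message = str(exc)
```

Passing names rather than objects is what lets the check report a missing or mistyped variable instead of failing with a `NameError` in the test cell itself.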

### 4. `assert_pd_series_equals_csv()` function

If we ask the student to create/compute a large pandas series for which we can't reasonably write out an expected variable by hand, we can use the `assert_pd_series_equals_csv()` function to compare the student's series with the correct series stored in a csv file.

In [40]:
# Pandas series
pandas_series = pd.Series([7, 9, 11, 13, 15])

# Expected csv file name
expected_csv_file_name = 'sum_series.csv'

In [43]:
# assertions
assert_pd_series_equals_csv(pandas_series, expected_csv_file_name)
# or
assert_pd_series_equals_csv(pandas_series, 'sum_series.csv')
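
The notebook does not show how `sum_series.csv` was produced or how the function reads it back. The following sketch shows one possible round trip using only pandas; the temporary directory and the use of `squeeze("columns")` are illustrative choices, not the documented behaviour of `utils.py`.

```python
import os
import tempfile

import pandas as pd

student_series = pd.Series([7, 9, 11, 13, 15])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sum_series.csv")

    # Write the solution once (instructor side), without the index column
    student_series.to_csv(csv_path, index=False)

    # Read it back: a one-column DataFrame squeezed into a Series
    expected = pd.read_csv(csv_path).squeeze("columns")

# The Series read from CSV is named after the header, so ignore names
pd.testing.assert_series_equal(student_series, expected, check_names=False)
series_matches_csv = True
```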

### 5. `assert_pd_series_variable_equals_csv()` function

The `assert_pd_series_variable_equals_csv()` function is similar to `assert_pd_series_equals_csv()`, but it takes the student's variable name and also checks that the variable exists and that its type is a pandas series, along with the value.

In [44]:
# reusing the variables and csv file from above
assert_pd_series_variable_equals_csv('pandas_series', 'sum_series.csv')
# or
assert_pd_series_variable_equals_csv('pandas_series', expected_csv_file_name)

### 6. `assert_pd_dataframe_equals_csv()` function

To compare the student's dataframe with the correct dataframe stored in a csv file, we can use the `assert_pd_dataframe_equals_csv()` function.

In [None]:
# DataFrames
df1 = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [6, 7, 8, 9, 10],
    }
)

In [45]:
# save the DataFrame to a csv file
df1.to_csv('activity_solutions_files/df1.csv', index=False)

In [48]:
# assertions
read_csv_kwargs = {'index_col': None}

assert_pd_dataframe_equals_csv(df1, 'df1.csv', read_csv_kwargs=read_csv_kwargs)

We can also pass other parameters, such as `index_col`, `usecols`, `squeeze`, `dtype`, `engine`, `true_values`, `false_values`, `skiprows`, `nrows`, `na_values`, `keep_default_na`, `thousands`, `comment`, `skipfooter`, `converters`, `verbose`, `encoding`, `memory_map`, `float_precision`, and `storage_options`, to the `assert_pd_dataframe_equals_csv()` function so that the csv file is read correctly.
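
Why these reading options matter can be seen with `index_col`. The sketch below uses plain `pd.read_csv` on an in-memory CSV (the small dataframe is made up for illustration) to show how a file saved with `index=True` reads back differently depending on `index_col`:

```python
import io

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Saving with index=True writes the row labels as an unnamed first column
csv_text = df.to_csv(index=True)

# Without index_col, that column comes back as data named "Unnamed: 0"
naive = pd.read_csv(io.StringIO(csv_text))
naive_columns = list(naive.columns)

# With index_col=0, it is used as the index and the columns match again
fixed = pd.read_csv(io.StringIO(csv_text), index_col=0)
fixed_columns = list(fixed.columns)

pd.testing.assert_frame_equal(fixed, df)
print(naive_columns, fixed_columns)
```

So the reading options should mirror how the solution file was written: `index=False` pairs naturally with `index_col=None`, and `index=True` with `index_col=0`.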

### 7. `assert_pd_dataframe_variable_equals_csv()` function

The `assert_pd_dataframe_variable_equals_csv()` function is similar to `assert_pd_dataframe_equals_csv()`, but it takes the student's variable name and also checks that the variable exists and that its type is a pandas dataframe, along with the value.

In [49]:
# reusing the variables and csv file from above
assert_pd_dataframe_variable_equals_csv('df1', 'df1.csv', read_csv_kwargs=read_csv_kwargs)

### 8. `assert_pd_dataframe_equals_variable()` function

This function is similar to `assert_pd_dataframe_equals()`, but it also checks the expected variable name and its type (pandas dataframe) along with the value.

In [50]:
# DataFrames
df1 = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [6, 7, 8, 9, 10],
    }
)

# assertions
expected_df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})
assert_pd_dataframe_variable_equals_variable('df1', 'expected_df1')

### 9. `assert_pd_dataframe_variable_equals_variable()` function

This function is similar to `assert_pd_dataframe_equals_variable()`, but it also deletes the expected variable after the comparison.

In [51]:
# DataFrames
df1 = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [6, 7, 8, 9, 10],
    }
)

# assertions
expected_df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})
assert_pd_dataframe_variable_equals_variable('df1', 'expected_df1')

In [53]:
print(expected_df1)

NameError: name 'expected_df1' is not defined

You can see that the expected variable is deleted from the student's environment after the comparison.
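
A minimal sketch of this compare-then-delete pattern is shown below. The helper `compare_and_discard` and its `namespace` argument are hypothetical; in particular, removing the expected variable even when the comparison fails is an assumption made here, not confirmed behaviour of `utils.py`.

```python
import pandas as pd


def compare_and_discard(student_name, expected_name, namespace):
    """Hypothetical helper: compare two DataFrames by name, then drop the expected one."""
    try:
        pd.testing.assert_frame_equal(
            namespace[student_name], namespace[expected_name]
        )
    finally:
        # Assumption: the reference answer is removed whether or not the check passed
        namespace.pop(expected_name, None)


# A plain dict stands in for the notebook's globals()
ns = {
    "df1": pd.DataFrame({"A": [1, 2, 3], "B": [6, 7, 8]}),
    "expected_df1": pd.DataFrame({"A": [1, 2, 3], "B": [6, 7, 8]}),
}

compare_and_discard("df1", "expected_df1", ns)

print("expected_df1" in ns)
print("df1" in ns)
```

Dropping the expected variable keeps the reference answer out of the student's environment once the check has run.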

### 10. `assert_pd_dataframe_csv_equals_csv()` function

To compare a dataframe that the student saved to a csv file with the correct dataframe stored in another csv file, we can use the `assert_pd_dataframe_csv_equals_csv()` function.

Let's create a new csv file `correct.csv` with the following data:

```
A,B,C
1,2,3
4,5,6
7,8,9
```

We can then use the `assert_pd_dataframe_csv_equals_csv()` function to compare this csv file with the student's csv file.

In [64]:
student_dataframe = pd.DataFrame(
    {
        "A": [1, 4, 7],
        "B": [2, 5, 8],
        "C": [3, 6, 9],
    }
)

# save the DataFrame twice: as the student's file 'student_dataframe.csv'
# and as the expected file 'expected_student_dataframe.csv' in the solutions folder
student_dataframe.to_csv('student_dataframe.csv', index=True)
student_dataframe.to_csv('activity_solutions_files/expected_student_dataframe.csv', index=True)



In [65]:
# assertions

assert_pd_dataframe_csv_equals_csv('student_dataframe.csv', 'expected_student_dataframe.csv')
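
For intuition, a file-to-file comparison can be sketched with plain pandas as below. This assumes both files are read with the same `pd.read_csv` options; the temporary directory is only there to keep the example self-contained, and the real function in `utils.py` may resolve paths and options differently.

```python
import os
import tempfile

import pandas as pd

student_dataframe = pd.DataFrame(
    {"A": [1, 4, 7], "B": [2, 5, 8], "C": [3, 6, 9]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    student_path = os.path.join(tmp_dir, "student_dataframe.csv")
    expected_path = os.path.join(tmp_dir, "expected_student_dataframe.csv")

    # Both files are written the same way (index included), as in the cell above
    student_dataframe.to_csv(student_path, index=True)
    student_dataframe.to_csv(expected_path, index=True)

    # Read both files with identical options so that they stay comparable
    student_from_csv = pd.read_csv(student_path)
    expected_from_csv = pd.read_csv(expected_path)

pd.testing.assert_frame_equal(student_from_csv, expected_from_csv)
print(student_from_csv.shape)
```

Because both files carry the saved index as an extra first column, the comparison still passes; what matters is that the two files were written and read consistently.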

### 12. `assert_pd_dataframe_variable_column_equals_csv()` function

If we ask the student to add a new column to a dataframe or modify an existing one, and we want to test whether the column was added/modified correctly, we can use the `assert_pd_dataframe_variable_column_equals_csv()` function to compare that column of the student's dataframe with the correct dataframe stored in a csv file.

In [68]:
# dataframes
df1 = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [6, 7, 8, 9, 10],
    }
)

# student added a new column to the DataFrame by adding the two columns together
df1['C'] = df1['A'] + df1['B']

# save the DataFrame to a csv file named 'column_check_df1.csv'
df1.to_csv('activity_solutions_files/column_check_df1.csv', index=True)

In [70]:
# assertions
read_csv_kwargs = {'index_col': None}
assert_pd_dataframe_variable_column_equals_csv('df1', 'C', 'column_check_df1.csv')
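
The idea of checking a single column against a stored solution can be sketched with plain pandas as follows. The in-memory CSV stands in for `column_check_df1.csv`, and reading it with `index_col=0` is an assumption chosen to match a file saved with `index=True`; the real function may read the file differently.

```python
import io

import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})
df1["C"] = df1["A"] + df1["B"]

# Stand-in for the solution file, written with the index included
expected_csv = io.StringIO(df1.to_csv(index=True))
expected_df = pd.read_csv(expected_csv, index_col=0)

# Only column "C" is compared; other columns are not inspected here
pd.testing.assert_series_equal(df1["C"], expected_df["C"])

# A wrongly computed column is detected
wrong = df1["A"] * df1["B"]
try:
    pd.testing.assert_series_equal(wrong, expected_df["C"], check_names=False)
    wrong_column_detected = False
except AssertionError:
    wrong_column_detected = True
```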