[Pandas DataFrame](https://thispointer.com/pandas-dataframe-loc/)

Select rows of Dataframe based on Callable function
Select rows of Dataframe based on bool array
Select a few Columns from Dataframe
Select multiple columns from Dataframe based on list of names
Select multiple columns from Dataframe based on name range
Select subset of Dataframe based on row/column names in list

Select a few rows from Dataframe, but include all column values

    Select a single row of Dataframe

    Select rows of Dataframe based on row label names in list

    Select rows of Dataframe based on row label name range

    Select rows of Dataframe based on bool array

    Select rows of Dataframe based on callable function

Select a subset of Dataframe with few rows and columns

    Select a Cell value from Dataframe
    
    Select subset of Dataframe based on row/column names in list
    
    Select subset of Dataframe based on row and column name range.
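
The selection patterns outlined above can be sketched with `DataFrame.loc` as follows. The sample data, row labels, and column names are made up for illustration and do not come from the linked article.

```python
import pandas as pd

# Hypothetical sample data; labels and column names are illustrative only
df = pd.DataFrame(
    {
        "Name": ["Jack", "Riti", "Aadi", "Mohit"],
        "Age": [34, 31, 16, 41],
        "City": ["Sydney", "Delhi", "Tokyo", "Delhi"],
    },
    index=["a", "b", "c", "d"],
)

# Single row by label (returns a Series)
single_row = df.loc["b"]

# Rows by list of labels, all columns
rows_by_list = df.loc[["a", "c"]]

# Rows by label range; with loc, both ends of the slice are included
rows_by_range = df.loc["b":"d"]

# Rows by boolean array (one flag per row)
rows_by_mask = df.loc[[True, False, True, False]]

# Rows by callable: the function receives the DataFrame and returns a mask
rows_by_callable = df.loc[lambda frame: frame["Age"] > 30]

# Single cell by row label and column name
cell = df.loc["c", "City"]

# Subset by lists of row and column labels
subset_by_list = df.loc[["a", "d"], ["Name", "City"]]

# Subset by row and column label ranges
subset_by_range = df.loc["a":"c", "Name":"Age"]
```

Label slices with `loc` include the end label, which differs from ordinary Python slicing.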

[Select DataFrame rows based on multiple conditions](https://thispointer.com/pandas-tutorial-part-9-filter-dataframe-rows/)
```
# Select only those rows where sale
# value is between 30 and 40
df = df[(df['Sale'] > 30) & (df['Sale'] < 40)]
# Display the DataFrame
print(df)
```
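
A self-contained sketch of the same filter, assuming a small made-up `Sale` column. It also shows `DataFrame.query` as an alternative way to write the condition; the `Product` column is hypothetical.

```python
import pandas as pd

# Hypothetical sales data for illustration
df = pd.DataFrame({"Product": ["A", "B", "C", "D"], "Sale": [25, 32, 38, 45]})

# Each comparison needs its own parentheses because & binds
# more tightly than > and < in Python
mask = (df["Sale"] > 30) & (df["Sale"] < 40)
filtered = df[mask]

# The same filter written as a query string
filtered_query = df.query("30 < Sale < 40")

print(filtered)
```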

[Get statistics for each group based on a single column](https://thispointer.com/get-statistics-for-each-group-using-pandas-groupby/)

```
# get avg. experience by team
print (df.groupby("Team")["Experience"].mean())
```


Get statistics for each group based on multiple columns

```
# get summary statistics of experience by team and country
print (df.groupby(["Team", "Country"])["Experience"].describe())
```

# Get statistics for each group using multiple aggregations

The multi-column example above was simple because the same aggregation applied throughout. But suppose we want the average of Experience and the median of Age by Team. In such cases, the agg function comes in handy.

```
# get avg. experience and median age by team
print (df.groupby("Team").agg({
    'Experience':'mean',
    'Age': 'median'
}))
```

We can also apply multiple aggregations to the same column. For example, let’s calculate both the mean and the median of the Experience column.

```
# get avg. and median experience, and median age by team
print (df.groupby("Team").agg({
    'Experience':['mean', 'median'],
    'Age': 'median'
}))
```
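
Passing a list of aggregations produces two-level column labels. As a possible alternative, here is a minimal sketch using pandas named aggregation to get flat column names. The sample data and the output column names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame(
    {
        "Team": ["A", "A", "B", "B"],
        "Experience": [2, 4, 10, 6],
        "Age": [25, 31, 40, 36],
    }
)

# Named aggregation: output_name=(source_column, aggregation)
summary = df.groupby("Team").agg(
    avg_experience=("Experience", "mean"),
    median_experience=("Experience", "median"),
    median_age=("Age", "median"),
)
print(summary)
```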

# Get statistics for each group using the apply() method

The apply method is similar to the agg method, but it offers more flexibility for writing custom functions that compute statistics for each group. For example, let’s create a custom function to get the mean of Experience for each Team group.

```
# using the apply method
def get_mean(x):
    return x.mean()
print (df.groupby('Team')['Experience'].apply(get_mean))
```
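
To illustrate the flexibility of custom functions, here is a sketch that computes a statistic with no built-in aggregation name: the spread between the most and least experienced member of each team. The sample data and the `experience_range` helper are hypothetical.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame(
    {
        "Team": ["A", "A", "B", "B"],
        "Experience": [2, 4, 10, 6],
    }
)


def experience_range(x):
    """Return the difference between the largest and smallest value in the group."""
    return x.max() - x.min()


# apply calls experience_range once per group with that group's Experience values
spread = df.groupby("Team")["Experience"].apply(experience_range)
print(spread)
```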

# Get statistics for each group using the transform() method

Another way to calculate statistics for each group is the transform method. Its advantage is that it calculates the statistic for each group and then broadcasts the result back to every row, in the order of the original row index. This is useful when we want to store the statistic in the same DataFrame, as it avoids an additional step to join the statistics back to the original DataFrame.

For example, let’s take the average of the Age column by Team and store it as a new column “avg_age_by_team” in the DataFrame.


```
# using the transform method
df['avg_age_by_team'] = df.groupby('Team')['Age'].transform('mean')
print (df)
```
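
A self-contained sketch comparing transform with the aggregate-then-join approach it replaces. The sample data and the `avg_age_merged` column name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; teams are interleaved to show row alignment
df = pd.DataFrame({"Team": ["A", "B", "A", "B"], "Age": [25, 40, 31, 36]})

# transform returns one value per original row, aligned to df's index
df["avg_age_by_team"] = df.groupby("Team")["Age"].transform("mean")

# The same result via aggregate-then-join, which needs an extra merge step
team_means = (
    df.groupby("Team")["Age"].mean().rename("avg_age_merged").reset_index()
)
merged = df.merge(team_means, on="Team", how="left")

print(merged)
```

Both columns hold the same values; transform simply skips the merge.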

At the most basic level, a custom function requires only five components:

1. Use the def keyword to begin the function definition.
2. Name your function.
3. Supply one or more parameters. ...
4. Enter lines of code that make your function do whatever it does. ...
5. Use the return keyword at the end of the function to return the output.
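
The five components listed above can be sketched in a small function. The name `rectangle_area` and its parameters are made up for illustration.

```python
def rectangle_area(width, height):  # 1. def keyword, 2. name, 3. parameters
    """Return the area of a rectangle."""
    area = width * height  # 4. body: the code that does the work
    return area  # 5. return the output


print(rectangle_area(3, 4))  # prints 12
```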

[Lambda Function](https://www.w3resource.com/python/python-user-defined-functions.php)

# Lambda Forms:

In Python, small anonymous (unnamed) functions can be created with the lambda keyword. Lambda forms can be passed as arguments wherever a function object is required, but syntactically they are restricted to a single expression. A function like this:

```
def average(x, y):
    return (x + y)/2
print(average(4, 3))

# may also be defined using lambda

print((lambda x, y: (x + y)/2)(4, 3))

```
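
A short sketch of lambdas passed as arguments to other functions, here the built-ins `sorted` and `map`. The sample data is hypothetical.

```python
# Hypothetical (name, age) pairs
pairs = [("Riti", 31), ("Jack", 34), ("Aadi", 16)]

# Sort by the second element of each pair using a lambda as the key function
by_age = sorted(pairs, key=lambda pair: pair[1])

# Apply an averaging lambda to each pair of numbers
averages = list(map(lambda pair: (pair[0] + pair[1]) / 2, [(4, 3), (10, 20)]))

print(by_age)
print(averages)
```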

[Writing Parquet Files in Python with Pandas, PySpark, and Koalas](https://mungingdata.com/python/writing-parquet-pandas-pyspark-koalas/)

In [1]:
!python --version


Python 3.9.7


In [1]:
int_list = [1, 2, 3, 4, 5, 6]
total = 0  # named total to avoid shadowing the built-in sum()
for value in int_list:
    total += value
print("Sum =", total)
print("Avg =", total / len(int_list))

Sum = 21
Avg = 3.5
