# A Primer on Python Libraries and Frameworks

Python libraries and frameworks provide pre-written code to help with specific tasks or types of work. Note that libraries have `dependencies`: other packages that must be installed for the library to work. Installation information can be found on the support website for each library. For typical scientific computing libraries (e.g., numpy, pandas), installation and dependencies are straightforward. For others, resolving dependencies can be fairly confusing.
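
One way to see what a library depends on is to ask Python itself. The sketch below uses the standard-library `importlib.metadata` module (Python 3.8 or newer). The helper name `describe_package` is made up for this illustration, and the packages queried are only examples that may or may not be installed on your machine.

```python
from importlib import metadata


def describe_package(name):
    """Return (version, declared dependencies) for an installed package.

    Returns None when the package is not installed in this environment.
    """
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
    # requires() returns None when the package declares no dependencies
    requirements = metadata.requires(name) or []
    return version, requirements


# Example packages only; swap in whatever you want to inspect
for package in ["numpy", "pandas"]:
    info = describe_package(package)
    if info is None:
        print(f"{package}: not installed in this environment")
    else:
        version, requirements = info
        print(f"{package} {version} declares {len(requirements)} dependencies")
```

The output depends entirely on what is installed, so treat this as a quick diagnostic rather than a replacement for each library's installation notes.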

## Environments
- **Definition**: An `environment` is a separate set of folders in which you can install a specific version of Python along with all of its dependencies.
- **Purpose**: Environments protect the `base python install` that your operating system relies on, and they give you a standalone Python ecosystem that can be tailored to the specific versions you need for compatibility.
- **Examples**: Using the Anaconda Navigator, add an environment called my_env. The process is fairly intuitive, but here are some [additional instructions](https://docs.anaconda.com/free/navigator/tutorials/manage-environments/). If something goes wrong, you can simply remove the environment and start fresh. `I strongly recommend not using your 'base' environment` for your Python programming.
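
To confirm which environment a script or notebook is actually running in, you can inspect the interpreter from inside Python. This is a minimal sketch using only the standard library. The function name `current_environment` is hypothetical, and the check assumes conda's usual behavior of setting the `CONDA_DEFAULT_ENV` variable when an environment is activated; outside conda that value will simply be `None`.

```python
import os
import sys


def current_environment():
    """Collect a few facts that identify the running Python environment."""
    return {
        # Path of the interpreter executing this code
        "executable": sys.executable,
        # Root folder of the active environment
        "prefix": sys.prefix,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # Assumption: set by conda on activation, absent otherwise
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


env = current_environment()
for key, value in env.items():
    print(f"{key}: {value}")

if env["conda_env"] == "base":
    print("You are in 'base'; consider switching to a dedicated environment such as my_env.")
```

If the printed prefix does not point at the folder for my_env, the notebook is probably using a different kernel than the one you intended.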

## Libraries
- **Definition**: A library in Python is a collection of modules and packages that offer pre-written code to perform common tasks.
- **Purpose**: Libraries are used to extend the functionality of Python, allowing for code reuse and modular programming.
- **Examples**: 
  - `numpy` for numerical operations.
  - `pandas` for data manipulation and analysis.
  - `matplotlib` for data visualization.

## Frameworks
- **Definition**: A framework is a collection of libraries and pre-written code that provides a foundational structure for developing specific types of applications.
- **Purpose**: Frameworks dictate the flow of control in applications and provide generic functionalities, which users can extend or override. `For typical scientific computing, you won't need to worry about frameworks`. 
- **Examples**:
  - `Django` and `Flask` for web development.
  - `PyQt` for graphical user interfaces (GUIs).

## Key Differences Between Libraries and Frameworks
- **Control Flow**: In a library, the control flow is dictated by the programmer. In a framework, the control flow is dictated by the framework (Inversion of Control).
- **Use Case**: Libraries are used when you need specific functionalities or utilities, whereas frameworks are used as a foundation for building applications.
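
The control-flow difference is easier to see side by side. In the sketch below, the first part calls a library function directly, while the second registers a function with a toy framework that decides when to call it. `MiniFramework`, `route`, `run`, and `mean_handler` are all invented for this illustration and do not belong to any real package; the decorator pattern is only similar in spirit to how web frameworks such as Flask register handlers.

```python
import statistics

# Library style: our code decides when to call the library.
readings = [20, 22, 19, 20, 21]
library_result = statistics.mean(readings)
print("Library call:", library_result)


# Framework style: we register a function and the framework calls it.
class MiniFramework:
    """Hypothetical toy framework, written only for this illustration."""

    def __init__(self):
        self._handlers = {}

    def route(self, name):
        """Return a decorator that registers a handler under `name`."""
        def register(func):
            self._handlers[name] = func
            return func
        return register

    def run(self, requests):
        """The framework owns the loop and invokes our handlers itself."""
        return [self._handlers[name](payload) for name, payload in requests]


app = MiniFramework()


@app.route("mean")
def mean_handler(values):
    # A library (statistics) used inside a framework handler
    return statistics.mean(values)


framework_results = app.run([("mean", readings), ("mean", [28, 30, 29, 31, 30])])
print("Framework calls:", framework_results)
```

Notice that `mean_handler` is never called by our own code. We hand it to the framework, and `run` invokes it, which is the inversion of control described above.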

## Integration
- **Complementing Each Other**: Libraries can be used within frameworks to enhance the functionality of applications.
- **Flexibility**: Python's ecosystem allows for seamless integration between different libraries and frameworks, making it a versatile choice for various types of projects.

## Conclusion
- Libraries and frameworks in Python significantly reduce development time and effort by providing reusable code and other features.
- For scientific computing, understanding the concept of libraries is important: you will have to import tools nearly every time you write a Python program.
- The most useful libraries to be familiar with at this point are [numpy](https://numpy.org/), [pandas](https://pandas.pydata.org/), [matplotlib](https://matplotlib.org/), [scipy](https://scipy.org/), and [seaborn](https://seaborn.pydata.org/).
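
Since importing is something you will do in almost every program, it may help to see the three common forms of the `import` statement together. This sketch uses only standard-library modules so it runs without installing anything; the scientific libraries listed above are imported with the same patterns.

```python
# 1. Import a whole module and use its name as a prefix.
import math

root = math.sqrt(16.0)
print(root)

# 2. Import a module under a shorter alias
#    (the same pattern as `import numpy as np`).
import statistics as st

middle = st.median([10, 15, 12])
print(middle)

# 3. Import a single name from a module or package
#    (the same pattern as `from scipy import stats`).
from fractions import Fraction

half = Fraction(1, 4) + Fraction(1, 4)
print(half)
```

Aliases such as `np`, `pd`, and `plt` are community conventions rather than requirements, but sticking to them makes your code easier for others to read.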


## Code Examples

In [None]:
# Python Libraries and Frameworks in Biochemistry
# ------------------------------------------------

# Importing necessary libraries
import numpy as np  # For numerical operations
import pandas as pd  # For data manipulation
import matplotlib.pyplot as plt  # For plotting
from scipy import stats  # For scientific computing
import seaborn as sns  # For advanced data visualization

# NumPy: Numerical Operations
# ---------------------------
# Example: Creating and manipulating a NumPy array for concentration values
concentrations = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
normalized_concentrations = concentrations / concentrations.max()
print("Normalized Concentrations:", normalized_concentrations)

# pandas: Data Manipulation
# -------------------------
# Example: Creating a DataFrame to store enzyme kinetics data
kinetics_data = {
    'Enzyme': ['Enzyme1', 'Enzyme2', 'Enzyme3'],
    'Vmax': [100, 200, 150],
    'Km': [10, 15, 12]
}
kinetics_df = pd.DataFrame(kinetics_data)
print("Enzyme Kinetics DataFrame:\n", kinetics_df)

# Matplotlib: Basic Plotting
# --------------------------
# Example: Plotting enzyme Vmax values
plt.plot(kinetics_df['Enzyme'], kinetics_df['Vmax'], marker='o')
plt.xlabel('Enzyme')
plt.ylabel('Vmax')
plt.title('Enzyme Vmax Comparison')
plt.show()

# SciPy: Scientific Computing
# ---------------------------
# Example: Performing a t-test on two sets of measurements
group1 = np.array([20, 22, 19, 20, 21])
group2 = np.array([28, 30, 29, 31, 30])
t_stat, p_val = stats.ttest_ind(group1, group2)
print("T-test result: t-statistic =", t_stat, ", p-value =", p_val)

# Seaborn: Advanced Data Visualization
# ------------------------------------
# Example: Creating a more complex visualization - enzyme kinetics heatmap
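# Keyword arguments are required here; recent pandas versions reject positional ones
heatmap_data = kinetics_df.pivot(index="Enzyme", columns="Km", values="Vmax")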
sns.heatmap(heatmap_data, annot=True)
plt.title('Enzyme Kinetics Heatmap')
plt.show()

# Exercises

## Exercise 1: NumPy Array Operations
- **Task**: Create a NumPy array representing the pH values of a solution series. Then, normalize these values so that they fall between 0 and 1, with the largest value equal to 1.
- **Hint**: Use `np.array()` to create the array and normalize by dividing by the maximum value.

## Exercise 2: pandas DataFrame Manipulation
- **Task**: Create a DataFrame with columns 'Substrate', 'ReactionRate', and 'Temperature'. Add at least 5 rows of data and then filter to show only rows where the temperature is above 37°C.
- **Hint**: Use `pd.DataFrame()` to create the DataFrame and boolean indexing for filtering.

## Exercise 3: Basic Plotting with Matplotlib
- **Task**: Plot the ReactionRate against Substrate from the DataFrame created in Exercise 2.
- **Hint**: Use `plt.plot()` and pass the relevant DataFrame columns.

## Exercise 4: Statistical Analysis with SciPy
- **Task**: Perform a correlation analysis between ReactionRate and Temperature from the DataFrame.
- **Hint**: Use `stats.pearsonr()` from the SciPy library.

## Exercise 5: Advanced Visualization with Seaborn
- **Task**: Create a scatter plot with a regression line between ReactionRate and Temperature using Seaborn.
- **Hint**: Use `sns.regplot()` and pass the relevant DataFrame columns.

---

These exercises aim to deepen your understanding of various Python libraries and frameworks, especially in the context of biochemistry-related data handling and analysis. Good luck!


In [1]:
# Your Answers Here; Create New Cells as Needed

## Solutions

In [None]:
# Importing necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import seaborn as sns

# Exercise 1: NumPy Array Operations
ph_values = np.array([6.8, 7.0, 7.2, 7.4, 7.6])
normalized_ph = ph_values / ph_values.max()
print("Normalized pH Values:", normalized_ph)

# Exercise 2: pandas DataFrame Manipulation
reaction_data = {
    'Substrate': ['Sub1', 'Sub2', 'Sub3', 'Sub4', 'Sub5'],
    'ReactionRate': [50, 60, 55, 65, 70],
    'Temperature': [35, 37, 40, 42, 45]
}
reaction_df = pd.DataFrame(reaction_data)
high_temp_df = reaction_df[reaction_df['Temperature'] > 37]
print("Reactions with Temperature > 37°C:\n", high_temp_df)

# Exercise 3: Basic Plotting with Matplotlib
# The task asks for the full DataFrame from Exercise 2, not the filtered subset
plt.plot(reaction_df['Substrate'], reaction_df['ReactionRate'], marker='o')
plt.xlabel('Substrate')
plt.ylabel('Reaction Rate')
plt.title('Reaction Rate vs. Substrate')
plt.show()

# Exercise 4: Statistical Analysis with SciPy
# Use all five rows; the filtered subset has only three points
correlation, p_value = stats.pearsonr(reaction_df['ReactionRate'], reaction_df['Temperature'])
print("Correlation between Reaction Rate and Temperature:", correlation)
print("P-value:", p_value)

# Exercise 5: Advanced Visualization with Seaborn
sns.regplot(x='Temperature', y='ReactionRate', data=reaction_df)
plt.title('Reaction Rate vs. Temperature with Regression Line')
plt.show()
