# A Primer on Python Data Types

Python offers several data types that are useful for different purposes. Here, we compare and contrast some of the most commonly used ones: lists, tuples, and dictionaries, which are built into the language, and NumPy arrays, which come from the third-party NumPy library.

## Lists
- **Mutable**: Lists can be modified after creation (add, remove, or change items).
- **Ordered**: The order of items is maintained, and items can be accessed by their position.
- **Syntax**: Created using square brackets `[]`.
- **Use Case**: Ideal for collections of items where the order matters and contents might change.
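
The properties above can be seen in a minimal sketch. The list name and its contents are illustrative sample data, not taken from the rest of this notebook.

```python
# Hypothetical sample data for illustration only.
amino_acids = ["Glycine", "Alanine", "Serine"]

amino_acids[1] = "Valine"          # change an item in place (mutable)
amino_acids.insert(0, "Proline")   # add an item at a specific position
amino_acids.remove("Serine")       # remove an item by value

print(amino_acids)        # ['Proline', 'Glycine', 'Valine']
print(amino_acids[-1])    # negative indices count from the end: Valine
print(len(amino_acids))   # 3
```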

## Tuples
- **Immutable**: Once a tuple is created, it cannot be modified.
- **Ordered**: Like lists, tuples maintain the order of items.
- **Syntax**: Created using parentheses `()`, with items separated by commas.
- **Use Case**: Suitable for fixed data sets, like coordinates or RGB color values.
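
As a short sketch of what immutability means in practice, the snippet below tries to change a tuple and catches the resulting error. The variable names are illustrative assumptions.

```python
# Hypothetical RGB value used only for illustration.
rgb_red = (255, 0, 0)

# Item assignment on a tuple raises TypeError.
try:
    rgb_red[1] = 128
except TypeError as error:
    print("Cannot modify a tuple:", error)

# A one-item tuple needs a trailing comma; parentheses alone are not enough.
single = (42,)
not_a_tuple = (42)
print(type(single).__name__)        # tuple
print(type(not_a_tuple).__name__)   # int

# Tuples can be unpacked into separate variables.
red, green, blue = rgb_red
print(red, green, blue)             # 255 0 0
```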

## Dictionaries
- **Mutable**: Can change, add, or delete key-value pairs.
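- **Insertion-ordered (Python 3.7+)**: Items are accessed via keys rather than by position. Since Python 3.7, dictionaries preserve the order in which keys were inserted; earlier versions did not guarantee any order.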
- **Syntax**: Created using curly braces `{}` with key-value pairs.
- **Use Case**: Perfect for associating keys with values, like mapping names to phone numbers.
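
The following sketch illustrates key-based access using the phone-number use case mentioned above. The names and numbers are made up for illustration, and the ordering shown assumes Python 3.7 or later.

```python
# Hypothetical phone book; names and numbers are invented.
phone_book = {"Ada": "555-0100", "Grace": "555-0199"}

phone_book["Linus"] = "555-0123"   # add a new key-value pair

# Iterating over a dict yields its keys in insertion order (Python 3.7+).
print(list(phone_book))            # ['Ada', 'Grace', 'Linus']

# get() returns a default instead of raising KeyError for a missing key.
print(phone_book.get("Guido", "not listed"))   # not listed

del phone_book["Grace"]            # delete a key-value pair
print("Grace" in phone_book)       # False
```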

## NumPy Arrays
- **Mutable**: Elements can be modified, but the array's size is fixed.
- **Ordered**: Elements are stored in a specific order.
- **Syntax**: Created using the `numpy.array()` function.
- **Use Case**: Ideal for numerical operations, especially in scientific computing, due to its efficiency and the availability of vectorized operations.
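
A small sketch of these points, using made-up concentration values: elements can be changed in place, arithmetic applies to every element at once, and functions that appear to grow an array, such as `np.append()`, return a new array and leave the original unchanged.

```python
import numpy as np

# Hypothetical concentration values for illustration only.
concentrations = np.array([0.5, 1.0, 2.0, 4.0])

concentrations[0] = 0.25        # modify an element in place
doubled = concentrations * 2    # vectorized: applies to every element
print(doubled)

# np.append() does not resize the original; it returns a new, larger array.
extended = np.append(concentrations, 8.0)
print(concentrations.shape, extended.shape)   # (4,) (5,)

# All elements share one data type.
print(concentrations.dtype)                   # float64
```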

## Key Differences
- **Mutability**: Lists and dictionaries are mutable, while tuples are immutable. NumPy arrays are mutable but have a fixed size.
- **Ordering**: Lists, tuples, and NumPy arrays are ordered, meaning the order of elements is preserved. Dictionaries preserve insertion order as of Python 3.7; in earlier versions their order was not guaranteed.
- **Performance**: NumPy arrays provide better performance for numerical operations compared to lists due to optimized C code under the hood.
- **Functionality**: Dictionaries offer a unique key-value pair structure, making them suitable for different use cases than lists, tuples, and arrays.

The choice of data type in Python largely depends on the specific requirements of the application, such as whether you need ordered or unordered data, mutable or immutable structures, or efficient numerical computations.
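
The performance difference between lists and NumPy arrays can be checked with a rough timing sketch like the one below. The data size, scale factor, and function names are arbitrary assumptions, and the measured times will vary by machine, so treat the output as indicative only.

```python
import timeit

import numpy as np

# Hypothetical workload: scale 100,000 numbers by a constant.
values_list = [float(i) for i in range(100_000)]
values_array = np.array(values_list)


def scale_list():
    # Pure-Python loop over a list.
    return [v * 1.5 for v in values_list]


def scale_array():
    # Vectorized NumPy operation on the whole array.
    return values_array * 1.5


# Both approaches produce the same numbers.
print(np.allclose(scale_list(), scale_array()))

# Time 20 repetitions of each approach.
list_time = timeit.timeit(scale_list, number=20)
array_time = timeit.timeit(scale_array, number=20)
print(f"List comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {array_time:.4f} s")
```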


## Code Examples

In [None]:
# Importing necessary libraries
import numpy as np

# Python Data Types in Biochemistry
# ----------------------------------

# Lists: Dynamic arrays, useful for storing collections of items.
# --------------------------------------------------------------
# Creating a list of common enzymes
enzymes = ["Ligase", "Helicase", "Polymerase", "Nuclease"]
print("List of Enzymes:", enzymes)

# Adding an enzyme to the list
enzymes.append("Transferase")
print("Updated List of Enzymes:", enzymes)

# Accessing a specific enzyme by index
print("Second Enzyme in the List:", enzymes[1])

# Tuples: Immutable sequences, useful for fixed data sets.
# ---------------------------------------------------------
# Defining a tuple of nucleotide bases
nucleotides = ("Adenine", "Thymine", "Cytosine", "Guanine")
print("Nucleotide Bases:", nucleotides)

# Accessing elements in a tuple
print("First Nucleotide Base:", nucleotides[0])  # Note that Python indexing starts at 0.

# Tuples are immutable, so you can't change them after creation
# This is useful for data that shouldn't be modified

# Dictionaries: Key-value pairs, great for mapping relationships.
# ----------------------------------------------------------------
# Creating a dictionary to map enzymes to their functions
enzyme_functions = {
    "Ligase": "Joining of DNA strands",
    "Helicase": "Unwinding DNA helix",
    "Polymerase": "Polymerizing nucleotides",
    "Nuclease": "Cutting DNA strands"
}
print("Enzyme Functions:", enzyme_functions)

# Accessing a function by enzyme name
print("Function of Helicase:", enzyme_functions["Helicase"])

# Adding a new key-value pair
enzyme_functions["Transferase"] = "Transfer of functional groups"
print("Updated Enzyme Functions:", enzyme_functions)

# NumPy Arrays: Efficient arrays for numerical data.
# --------------------------------------------------
# Creating a NumPy array of pH values
ph_values = np.array([7.2, 7.4, 6.8, 7.0, 7.3])
print("pH Values:", ph_values)

# Performing calculations on the entire array
average_ph = np.mean(ph_values)
print("Average pH:", average_ph)

# Conclusion
# ----------
# This code block demonstrates the use of different data types in Python,
# such as lists, tuples, dictionaries, and NumPy arrays. Each data type serves a specific
# purpose and can be used to efficiently store and manipulate data relevant
# to biochemistry applications.

# Python Data Types Exercises

## Exercise 1: List Manipulation
- **Task**: Create a list of five different proteins found in the human body. Then, write a function to add a new protein to the list and print the updated list.
- **Hint**: Use the `append()` method to add items to a list.

## Exercise 2: Tuple Operations
- **Task**: Define a tuple containing the names of four different vitamins. Write a loop to iterate through the tuple and print each vitamin name.
- **Hint**: Remember, tuples are immutable, so you cannot modify them after creation.

## Exercise 3: Dictionary Handling
- **Task**: Create a dictionary mapping three amino acids to their respective molecular weights. Then, write a function that takes an amino acid name as input and returns its molecular weight.
- **Hint**: Use the amino acids 'Alanine', 'Cysteine', and 'Aspartic Acid' with arbitrary weights.

## Exercise 4: NumPy Array Calculations
- **Task**: Create a NumPy array of ten random enzyme activity values. Calculate and print the mean and standard deviation of these values.
- **Hint**: Use `np.random.rand()` to create random values and `np.mean()`, `np.std()` for calculations.

## Exercise 5: Advanced Data Structure Challenge
- **Task**: Combine the concepts of lists, tuples, and dictionaries. Create a dictionary where each key is an enzyme, and its value is a tuple containing the enzyme's pH optimum and a list of its substrates. Include 3 enzymes in your data structure.

Good luck!


In [12]:
# Your answers here; create new cells as needed

## Solutions

In [None]:
# Import necessary libraries
import numpy as np

# Exercise 1: List Manipulation
def add_protein(proteins, new_protein):
    # append() modifies the list in place; the same list is returned for convenience
    proteins.append(new_protein)
    return proteins

proteins = ["Hemoglobin", "Insulin", "Keratin", "Collagen", "Myosin"]
new_protein = "Actin"
updated_proteins = add_protein(proteins, new_protein)
print("Updated Protein List:", updated_proteins)

# Exercise 2: Tuple Operations
vitamins = ("Vitamin A", "Vitamin B", "Vitamin C", "Vitamin D")
for vitamin in vitamins:
    print("Vitamin:", vitamin)

# Exercise 3: Dictionary Handling
amino_acid_weights = {
    "Alanine": 89.1,
    "Cysteine": 121.2,
    "Aspartic Acid": 133.1
}

def get_molecular_weight(amino_acid):
    return amino_acid_weights.get(amino_acid, "Unknown")

print("Molecular Weight of Alanine:", get_molecular_weight("Alanine"))

# Exercise 4: NumPy Array Calculations
enzyme_activities = np.random.rand(10)
print("Enzyme Activities:", enzyme_activities)
print("Mean Activity:", np.mean(enzyme_activities))
print("Standard Deviation:", np.std(enzyme_activities))

# Exercise 5: Advanced Data Structure Challenge
enzymes = {
    "Amylase": (6.8, ["Starch", "Glycogen"]),
    "Lipase": (7.0, ["Triglycerides"]),
    "Protease": (6.5, ["Proteins"])
}

for enzyme, (pH, substrates) in enzymes.items():
    print(f"Enzyme: {enzyme}, pH Optimum: {pH}, Substrates: {substrates}")

