## Overview

Python offers several ways to import modules and their contents. Each method has specific use cases, advantages, and potential pitfalls. In this notebook, we'll explore:

1. **Standard import** (`import module`)
2. **Selective import** (`from module import name`)
3. **Wildcard import** (`from module import *`)
4. **Aliasing** (`import module as alias`)

Understanding these strategies is crucial for writing clean, maintainable code, especially in data analysis where you'll work with many libraries.

## Strategy 1: Standard Import

### Syntax

```python
import module_name
```

This imports the entire module, and you access its contents using dot notation.

In [None]:
import math

# Must use module.function syntax
result = math.sqrt(25)
pi_value = math.pi

print(f"Square root: {result}")
print(f"Pi value: {pi_value}")

### Advantages

- **Clear code origin**: Always know where a function comes from
- **No naming conflicts**: Module namespace is separate from your code
- **Best practice**: Recommended for most situations

### Disadvantages

- **More typing**: Need to prefix everything with module name
- **Can be verbose**: Especially with long module names

### Example: Data Analysis with Standard Import

In [None]:
import statistics
import math

# Sales data for a week
sales = [1250, 1340, 980, 1450, 1290, 1380, 1420]

# Using statistics module
mean_sales = statistics.mean(sales)
median_sales = statistics.median(sales)
std_dev = statistics.stdev(sales)

print(f"Mean sales: ${mean_sales:.2f}")
print(f"Median sales: ${median_sales:.2f}")
print(f"Standard deviation: ${std_dev:.2f}")

# Calculate the coefficient of variation (standard deviation as a percentage of the mean)
cv = (std_dev / mean_sales) * 100
print(f"\nCoefficient of variation: {cv:.2f}%")

## Strategy 2: Selective Import

### Syntax

```python
from module_name import function_name
from module_name import function1, function2, constant
```

This imports specific entities directly into your namespace.

In [None]:
from math import sqrt, pi, e

# No need for 'math.' prefix
result = sqrt(25)
print(f"Square root: {result}")
print(f"Pi: {pi}")
print(f"Euler's number: {e}")

### Advantages

- **Less typing**: Direct access without module prefix
- **Cleaner code**: When using only a few functions
- **Explicit imports**: Shows exactly what you're using

### Disadvantages

- **Potential naming conflicts**: Imported names can shadow existing ones
- **Less clear origin**: Harder to trace where functions come from in large codebases
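
The shadowing risk is not limited to your own variables: a selective import can also hide a built-in. The following is a minimal sketch; the variable names `shadowed` and `original` are illustrative and not part of the notebook. It relies on the fact that `math.pow` always returns a float, while the built-in `pow` returns an integer for integer arguments.

```python
import builtins
from math import pow  # binds the name 'pow' to math.pow, hiding the built-in

shadowed = pow(2, 3)           # math.pow always returns a float
original = builtins.pow(2, 3)  # the built-in stays reachable through the builtins module

print(f"pow after the import: {shadowed!r}")
print(f"built-in pow: {original!r}")
```

If the float result is not what you wanted, `import math` followed by `math.pow(...)` keeps both functions available under distinct names.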

### Understanding Namespace Pollution

In [None]:
# Example of naming conflict
from math import pi

print(f"Math pi: {pi}")

# Rebinding pi (the name now points to our value instead of the imported one)
pi = 3.14
print(f"Our pi: {pi}")

# The name 'pi' no longer refers to the value imported from math!

In [None]:
# The reverse order: define pi first, then import
pi = 3.14
print(f"Our pi before import: {pi}")

from math import pi
print(f"After import, pi is: {pi}")

# The import silently replaces our definition!

### Example: Data Analysis with Selective Import

In [None]:
from statistics import mean, median, stdev
from math import sqrt

# Test scores dataset
scores = [78, 85, 92, 88, 76, 95, 89, 84, 91, 87]

# Calculate statistics without module prefixes
avg_score = mean(scores)
mid_score = median(scores)
score_stdev = stdev(scores)

print(f"Average score: {avg_score:.2f}")
print(f"Median score: {mid_score:.2f}")
print(f"Standard deviation: {score_stdev:.2f}")

# Calculate z-score for a student who got 95
student_score = 95
z_score = (student_score - avg_score) / score_stdev
print(f"\nZ-score for {student_score}: {z_score:.2f}")

## Strategy 3: Wildcard Import (Use with Caution!)

### Syntax

```python
from module_name import *
```

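This imports **all** public names from a module: the names listed in the module's `__all__` if it defines one, otherwise every name that does not start with an underscore.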

In [None]:
from math import *

# All math functions are now available
print(sqrt(16))
print(sin(pi / 2))
print(log(e))
print(factorial(5))
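
Which names count as "public" for a wildcard import can be checked with a small experiment. The sketch below is illustrative only: `demo_mod`, `public_value`, and `_private_value` are hypothetical names, and the module is built in memory with `types.ModuleType` so the example stays self-contained. The wildcard import is run through `exec` into a separate dictionary so that it does not touch the notebook's own namespace.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name: demo_mod)
demo_mod = types.ModuleType("demo_mod")
demo_mod.public_value = 1
demo_mod._private_value = 2
sys.modules["demo_mod"] = demo_mod

# Without __all__, names starting with an underscore are skipped
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(f"Without __all__: {imported}")

# With __all__, exactly the listed names are imported, underscore or not
demo_mod.__all__ = ["_private_value"]
namespace_with_all = {}
exec("from demo_mod import *", namespace_with_all)
imported_with_all = sorted(
    name for name in namespace_with_all if not name.startswith("__")
)
print(f"With __all__: {imported_with_all}")

# Remove the throwaway module again
del sys.modules["demo_mod"]
```

The `math` module does not define `__all__`, so the cell above falls under the first rule: every name in `math` without a leading underscore is imported.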

### Why This Is Problematic

In [None]:
# Danger: You don't know what's being imported
from math import *
import math  # needed only to inspect the module's contents

# How many public names did the wildcard import bring in?
math_items = [name for name in dir(math) if not name.startswith('_')]
print(f"Items imported from math: {len(math_items)}")
print(f"\nSome examples: {math_items[:10]}")

### Real Problem: Silent Conflicts

In [None]:
# Define a custom function
def gcd(a, b):
    """My custom GCD that always returns -1 (buggy!)"""
    return -1

print(f"My GCD(10, 5): {gcd(10, 5)}")

# Now import everything from math
from math import *

# My function is silently replaced!
print(f"After wildcard import, GCD(10, 5): {gcd(10, 5)}")
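
One way to see such a conflict coming is to compare the names a wildcard import would bring in against the names you have already defined. The helper below is a hypothetical sketch, not a standard library function: `find_conflicts` and the sample dictionary `my_namespace` are names invented for this illustration. It follows the same rule as the wildcard import itself (use `__all__` if present, otherwise all names without a leading underscore).

```python
import math


def find_conflicts(module, namespace):
    """Return the names a wildcard import of `module` would overwrite in `namespace`."""
    exported = getattr(module, "__all__", None)
    if exported is None:
        exported = [name for name in dir(module) if not name.startswith("_")]
    return sorted(name for name in exported if name in namespace)


# A stand-in for the notebook's namespace: a custom gcd plus an unrelated variable
my_namespace = {"gcd": lambda a, b: -1, "total": 0}

conflicts = find_conflicts(math, my_namespace)
print(f"Names that 'from math import *' would replace: {conflicts}")
```

In a real session you could pass `globals()` instead of `my_namespace` before running the wildcard import.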

### When Wildcard Import Is Acceptable

- **Interactive sessions**: Quick testing in REPL or Jupyter
- **Well-known modules**: Modules specifically designed for it, typically by defining `__all__` to control what is exported (rare)
- **Temporary code**: Prototyping that will be refactored

**Never use in production code!**

## Strategy 4: Aliasing with 'as'

### Module Aliasing

```python
import module_name as alias
```

Give a module a shorter or more meaningful name.

In [None]:
import statistics as stats

data = [10, 20, 30, 40, 50]
print(f"Mean: {stats.mean(data)}")
print(f"Median: {stats.median(data)}")
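
An alias does not create a copy of the module; it is simply a second name for the same module object. The short sketch below illustrates this (the variable names `same_module`, `cached_module`, and `real_name` are illustrative).

```python
import sys
import statistics
import statistics as stats

same_module = stats is statistics                  # both names refer to one module object
cached_module = sys.modules["statistics"] is stats  # the object Python cached on first import
real_name = stats.__name__                          # the module still reports its real name

print(f"Same object: {same_module}")
print(f"Cached in sys.modules: {cached_module}")
print(f"Module name: {real_name}")
```

Because both names point to the same object, aliasing costs nothing at runtime; it only changes how you refer to the module in your code.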

### Common Aliases in Data Analysis

In [None]:
# These are STANDARD conventions - always use these!
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create a simple numpy array
arr = np.array([1, 2, 3, 4, 5])
print(f"NumPy array: {arr}")
print(f"Mean: {np.mean(arr)}")

# Create a simple pandas Series
series = pd.Series([10, 20, 30, 40, 50])
print(f"\nPandas Series:\n{series}")

### Function-Level Aliasing

In [None]:
from statistics import mean as average, stdev as std

values = [15, 20, 25, 30, 35]
print(f"Average: {average(values)}")
print(f"Standard deviation: {std(values)}")

### Example: Resolving Name Conflicts with Aliases

In [None]:
# Both Python's random module and numpy.random provide a random() function
import numpy as np
import random

# Generate random numbers using both
python_random = random.random()
numpy_random = np.random.random()

print(f"Python random: {python_random}")
print(f"NumPy random: {numpy_random}")

# Alternative: alias specific functions
from random import random as py_random
from numpy.random import random as np_random

print(f"\nUsing aliases:")
print(f"Python: {py_random()}")
print(f"NumPy: {np_random()}")

## Comparison: All Strategies Together

In [None]:
# Dataset for comparison
dataset = [23, 45, 67, 89, 12, 34, 56, 78, 90, 21]

print("=" * 50)
print("STRATEGY 1: Standard Import")
print("=" * 50)

import statistics
result1 = statistics.mean(dataset)
print(f"statistics.mean(dataset) = {result1}")
print("Pro: Clear origin, no conflicts")
print("Con: More typing\n")

print("=" * 50)
print("STRATEGY 2: Selective Import")
print("=" * 50)

from statistics import mean
result2 = mean(dataset)
print(f"mean(dataset) = {result2}")
print("Pro: Less typing, explicit")
print("Con: Possible name conflicts\n")

print("=" * 50)
print("STRATEGY 3: Wildcard Import (NOT RECOMMENDED)")
print("=" * 50)

from statistics import *
result3 = mean(dataset)
print(f"mean(dataset) = {result3}")
print("Pro: All functions available")
print("Con: Namespace pollution, unclear imports\n")

print("=" * 50)
print("STRATEGY 4: Aliasing")
print("=" * 50)

import statistics as stats
result4 = stats.mean(dataset)
print(f"stats.mean(dataset) = {result4}")
print("Pro: Shorter code, clear origin")
print("Con: Must remember aliases\n")

## Best Practices for Data Analysis

### 1. Use Standard Aliases for Common Libraries

In [None]:
# ALWAYS use these standard aliases
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Everyone in the data science community recognizes these

### 2. Group Imports by Type

In [None]:
# Standard library imports
import os
import sys
import math

# Third-party imports
import numpy as np
import pandas as pd

# Local imports (your own modules)
# from my_module import my_function

### 3. Be Explicit in Production Code

In [None]:
# Good: Clear what's being used
from statistics import mean, median, stdev

# Bad: Unclear what's available
# from statistics import *

### 4. Consider Readability vs. Brevity

In [None]:
import numpy as np

# This is fine for short scripts
from numpy import array, mean, std

# But this is better for larger projects (clear origin)
arr = np.array([1, 2, 3])
avg = np.mean(arr)
deviation = np.std(arr)

## Real-World Example: Data Analysis Pipeline

In [None]:
# Standard imports for a data analysis project
import numpy as np
import pandas as pd
from statistics import mean, median
import math

# Simulated sales data
sales_data = {
    'product': ['A', 'B', 'C', 'D', 'E'],
    'units_sold': [120, 85, 200, 150, 95],
    'price': [29.99, 49.99, 19.99, 39.99, 59.99]
}

# Create DataFrame (pandas)
df = pd.DataFrame(sales_data)

# Calculate revenue (pandas)
df['revenue'] = df['units_sold'] * df['price']

# Convert to a NumPy array for statistical calculations
revenues = df['revenue'].to_numpy()

# Use different modules appropriately
print("Sales Analysis")
print("=" * 40)
print(f"\nDataFrame (pandas):")
print(df)

print(f"\nStatistics (using statistics module):")
print(f"Mean revenue: ${mean(revenues):.2f}")
print(f"Median revenue: ${median(revenues):.2f}")

print(f"\nNumPy calculations:")
print(f"Total revenue: ${np.sum(revenues):.2f}")
print(f"Population std deviation: ${np.std(revenues):.2f}")  # np.std defaults to ddof=0

print(f"\nMath module:")
max_revenue = max(revenues)
log_max = math.log10(max_revenue)
print(f"Log10 of max revenue: {log_max:.2f}")
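
Mixing modules is one more reason to keep the origin of each function visible: similarly named functions can compute different things. `np.std` uses the population formula by default (`ddof=0`), whereas `statistics.stdev`, used earlier in this notebook, uses the sample formula. The sketch below shows the difference with the standard library alone, so it runs without NumPy; the revenue figures are the `units_sold * price` products from the example above, and the variable names are illustrative.

```python
import statistics

# Revenues from the example above (units_sold * price for products A to E)
revenues = [3598.80, 4249.15, 3998.00, 5998.50, 5699.05]

sample_sd = statistics.stdev(revenues)       # sample formula: divides by n - 1
population_sd = statistics.pstdev(revenues)  # population formula: divides by n

print(f"Sample std deviation: ${sample_sd:.2f}")
print(f"Population std deviation: ${population_sd:.2f}")
```

With a qualified call such as `statistics.stdev(...)` or `np.std(...)`, a reader can tell at a glance which definition is in use.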

## Exercises

### Exercise 1: Import Strategy Decision

For each scenario, decide which import strategy is most appropriate and explain why:

1. You need to use `sqrt`, `sin`, `cos`, and `pi` from the `math` module extensively throughout your code
2. You're writing a quick prototype and need many functions from `statistics`
3. You're building a production data analysis library
4. You only need the `mean` function from `statistics`

In [None]:
# Write your answers and code here

# Scenario 1:

# Scenario 2:

# Scenario 3:

# Scenario 4:

### Exercise 2: Fixing Namespace Conflicts

The following code relies on two wildcard imports, which hide where each name comes from and invite namespace conflicts. Fix it using appropriate import strategies:

In [None]:
# Problematic code
from math import *
from statistics import *

# Only statistics defines 'variance' (math does not), but nothing here shows that
data = [1, 2, 3, 4, 5]

# Which module provides variance? This is unclear!
result = variance(data)  # Runs, but the origin of 'variance' is hidden

# Fix the code here:

### Exercise 3: Real-World Scenario

You're analyzing temperature data. Write code that:

1. Imports necessary modules using best practices
2. Creates a dataset of temperatures: `[22.5, 24.1, 23.8, 25.2, 24.7, 23.5, 22.9, 24.3, 23.7, 24.0]`
3. Uses `statistics` module to calculate mean and stdev
4. Uses `math` module to calculate the ceiling and floor of the mean
5. Uses `numpy` (if available) to calculate percentiles (25th, 50th, 75th)

Use appropriate import strategies for each module.

In [None]:
# Your solution here

### Exercise 4: Understanding Import Behavior

Predict the output of this code before running it. Then explain what happened:

In [None]:
# Code block 1
pi = 3.14159
print(f"Initial pi: {pi}")

from math import pi
print(f"After import: {pi}")

pi = 3.14
print(f"After reassignment: {pi}")

# Can we access the original math.pi value now?
# Try to figure out how

## Key Takeaways

| Strategy | Syntax | Use When | Avoid When |
|----------|--------|----------|------------|
| **Standard Import** | `import module` | Most situations, production code | Module name is very long |
| **Selective Import** | `from module import name` | Need few specific items | Importing many items, risk of conflicts |
| **Wildcard Import** | `from module import *` | Quick prototyping, interactive use | Production code, any serious project |
| **Aliasing** | `import module as alias` | Standard conventions (numpy, pandas), long names | No clear standard exists |

**Golden Rules:**
1. Use standard aliases for common data science libraries
2. Avoid wildcard imports in production code
3. Be explicit about what you import
4. Consider readability for future maintainers
5. Group imports logically (standard library, third-party, local)

## What's Next?

In the next notebook, we'll explore specific modules from Python's **Standard Library** that are essential for data analysis: `math`, `random`, `statistics`, `datetime`, and `platform`.