# 📚 Libraries

## 🎯 Overview
### 📝 Notes
- A library is a collection of packages or modules that are grouped together to extend functionality 📚
- **Types**:
  - 🐍 Standard libraries (included with Python)
  - 📦 Third-party libraries & packages (need to be installed)
- 🔧 Third-party libraries are typically installed with a package manager such as `pip` or `conda`

### ❓ What the heck are Packages then?
- A package can include modules and sub-packages 📁
- **📦 Packages vs 📚 Libraries**
  - Python uses packages to organize the code inside its libraries, and libraries to provide and extend functionality.
  - While all libraries can be considered packages (if they are structured that way), not all packages are libraries.
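
To make the package / module relationship concrete, here is a minimal sketch that builds a throwaway package on disk and imports a module from it. The names `demo_pkg`, `greetings`, and `hello` are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A package is a directory that contains an __init__.py file
    pkg_dir = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")

    # A module is a single .py file inside that package
    (pkg_dir / "greetings.py").write_text(
        "def hello(name):\n"
        "    return f'Hello, {name}!'\n"
    )

    # Make the temporary folder importable, then import package.module
    sys.path.insert(0, tmp)
    try:
        greetings = importlib.import_module("demo_pkg.greetings")
        message = greetings.hello("Python")
    finally:
        sys.path.remove(tmp)

print(message)
```

A library in the sense used here would simply be one or more such packages shipped together.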

### ⭐ Importance
- ⚡ Libraries save time, standardize processes, and make it easy to add complex functionality.
- 🎯 Used for things like data analysis, machine learning, web development, automation, and more.

## 🐍 Standard Python Libraries
### 📝 Notes
- Just as with modules, we use the `from` and `import` keywords. 🔑
- `import` loads an entire module
- `from` pulls specific attributes out of a module directly
- **🎯 When to use one or the other**:
  - `import module_name`: Use when you need to access several functions or attributes from a module
  - `from module_name import attribute_name`: Use when you only need a specific function or attribute from a module
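
As a small illustration of the two styles, here is a sketch using the standard `math` module. It was picked only as an example; any module works the same way.

```python
# Style 1: import the whole module and reach into it with dot notation
import math

circle_area = math.pi * math.pow(2, 2)  # area of a circle with radius 2
root = math.sqrt(16)

# Style 2: import only the names you need and call them directly
from math import floor, sqrt

rounded_down = floor(circle_area)
same_root = sqrt(16)

print(root, same_root, rounded_down)
```

With the first style it is always obvious where a function comes from (`math.sqrt`). The second style is shorter to type when only a couple of names are needed.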

## 💪 Importance of Libraries
### 📝 Notes
What if we want to be able to manipulate files? Python ships with everything needed for this: the built-in `open` function returns a file object that we can read, write, and close. 📁

**🔧 Functions**:

- `open` 🔓: Opens a file and returns a file object. Mode `'r'` is for reading (the default), `'w'` for writing (overwrites existing content), and `'a'` for appending; add `'b'` to any of these (for example `'rb'`) for binary mode.
- `read` 📖: Reads the entire content of an open file. You can also use `readline()` for a single line or `readlines()` for all lines as a list.
- `write` ✍️: Writes a string to an open file. If the file was opened in append mode (`'a'`), the text is added at the end.
- `close` 🔒: Closes an open file, which is important for freeing up system resources.
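
The sketch below strings these functions together on a scratch file. Two assumptions to note: the file name `library_notes_demo.txt` is made up for the example, and the code uses a `with` block, which closes the file automatically, instead of calling `close()` by hand.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical scratch file inside a temporary folder
    path = os.path.join(tmp, "library_notes_demo.txt")

    # 'w' creates the file (or overwrites it); write() adds text
    with open(path, "w") as file:
        file.write("first line\n")

    # 'a' appends to the end instead of overwriting
    with open(path, "a") as file:
        file.write("second line\n")

    # 'r' reads; readlines() returns every line as a list of strings
    with open(path, "r") as file:
        lines = file.readlines()

print(lines)
```

Leaving the `with` block has the same effect as calling `file.close()`, even if an error occurs part-way through.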

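## 🚀 Examples
**📌 Note**: This example was written for Google Colab, with access to the `sample_data` folder. The run shown below reads a local copy of the file instead, so adjust the path to wherever your CSV lives. 🌐

To read the contents of the housing CSV (`california_housing_test.csv` on Colab, a local `california_housing.csv` in the run below) and print them: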

In [2]:
# Open the file in read mode
file = open('D:/Udemy/Luke/Data Analysis Mastering/Python for Data Analytics/DataSets/california_housing.csv')

# Read the file
content = file.read()

# Close the file
file.close()

# Print the content
print(content)

longitude,latitude,housing_median_age,total_rooms,total_bedrooms,population,households,median_income,median_house_value,ocean_proximity
-122.23,37.88,41.0,880.0,129.0,322.0,126.0,8.3252,452600.0,NEAR BAY
-122.22,37.86,21.0,7099.0,1106.0,2401.0,1138.0,8.3014,358500.0,NEAR BAY
-122.24,37.85,52.0,1467.0,190.0,496.0,177.0,7.2574,352100.0,NEAR BAY
-122.25,37.85,52.0,1274.0,235.0,558.0,219.0,5.6431,341300.0,NEAR BAY
-122.25,37.85,52.0,1627.0,280.0,565.0,259.0,3.8462,342200.0,NEAR BAY
-122.25,37.85,52.0,919.0,213.0,413.0,193.0,4.0368,269700.0,NEAR BAY
-122.25,37.84,52.0,2535.0,489.0,1094.0,514.0,3.6591,299200.0,NEAR BAY
-122.25,37.84,52.0,3104.0,687.0,1157.0,647.0,3.12,241400.0,NEAR BAY
-122.26,37.84,42.0,2555.0,665.0,1206.0,595.0,2.0804,226700.0,NEAR BAY
-122.25,37.84,52.0,3549.0,707.0,1551.0,714.0,3.6912,261100.0,NEAR BAY
-122.26,37.85,52.0,2202.0,434.0,910.0,402.0,3.2031,281500.0,NEAR BAY
-122.26,37.85,52.0,3503.0,752.0,1504.0,734.0,3.2705,241800.0,NEAR BAY
-122.26,37.85,52.0,2491.0,474.0,

In [3]:
import csv

data_dict = {}
headers = []
for index, row in enumerate(csv.reader(content.strip().split('\n'))):
    if index == 0:
        # Initialize dictionary with column headers as keys
        headers = row
        for column in headers:
            data_dict[column] = []
    else:
        # Append each element in the row to the list for its column
        for column, value in zip(headers, row):
            data_dict[column].append(value)

# Print the dictionary to verify contents
print(data_dict)

{'longitude': ['-122.23', '-122.22', '-122.24', '-122.25', '-122.25', '-122.25', '-122.25', '-122.25', '-122.26', '-122.25', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.27', '-122.27', '-122.26', '-122.27', '-122.27', '-122.27', '-122.27', '-122.27', '-122.27', '-122.28', '-122.28', '-122.28', '-122.28', '-122.28', '-122.28', '-122.28', '-122.27', '-122.27', '-122.27', '-122.27', '-122.27', '-122.28', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.26', '-122.27', '-122.26', '-122.27', '-122.27', '-122.27', '-122.27', '-122.27', '-122.28', '-122.28', '-122.28', '-122.28', '-122.28', '-122.29', '-122.29', '-122.29', '-122.29', '-122.3', '-122.3', '-122.3', '-122.3', '-122.29', '-122.3', '-122.29', '-122.29', '-122.29', '-122.29', '-122.29', '-122.29', '-122.28', '-122.28', '-122.28', '-122.29', '-122.28', '-122.28', '-122.27', '-122.28', '-122.28', '-122.28', '-122.28', '-122.27', '-122.27', '-122.27', '-122.27', 
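
For comparison, the standard library's `csv.DictReader` can build the same column-wise dictionary with less bookkeeping. This is only a sketch: `sample_content` is a small stand-in string (three columns and the first two rows from the output above), not the full file.

```python
import csv
import io

# Stand-in for the file content: a subset of the columns and rows printed earlier
sample_content = """longitude,latitude,total_rooms
-122.23,37.88,880.0
-122.22,37.86,7099.0
"""

data_dict = {}
# DictReader uses the first line as headers and yields one dict per row
for row in csv.DictReader(io.StringIO(sample_content)):
    for column, value in row.items():
        data_dict.setdefault(column, []).append(value)

print(data_dict)

# Values are still strings, so convert them before doing any math
total_rooms = sum(float(value) for value in data_dict["total_rooms"])
print(total_rooms)
```

Note that every value is read as text, which is why the dictionary above is full of quoted numbers.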

# Working with CSV Files Made Easy

Okay, that's honestly a lot to do each time you want to work with CSV files. But we could use a third-party library called **pandas** instead to load files much more easily.

## Third-Party Libraries

### Example with Pandas

With a library like pandas, you can read the file and convert it in just **3 lines of code!**


In [5]:
import pandas as pd

# Create a dataframe from csv file
df = pd.read_csv('D:/Udemy/Luke/Data Analysis Mastering/Python for Data Analytics/DataSets/california_housing.csv')

# Print the dataframe
print(df)

# Get the sum of total_bedrooms
sum_total_bedrooms = df['total_bedrooms'].sum()
print(sum_total_bedrooms)

# 📦 Package Management & Third-Party Libraries

## 📝 Notes

Third-party packages and libraries are not included in the Python Standard Library, so they need to be installed separately. 🔧

How you install a package depends on the package manager you are using. There are two main options:

- **🐍 Pip** - If you use pip for package management (Google Colab uses this)
- **📊 Conda** - If you use conda or mamba for package management (we'll use this in the Advanced Chapter)

> **💡 NOTE**: We'll go deeper into package management in the Advanced Chapter.

## 🎯 Common Third-Party Libraries

Below are some common third-party libraries used in data science:

- **🐼 Pandas**: Offers data structures and operations for manipulating numerical tables and time series
- **🔢 NumPy**: Supports large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions
- **📊 Matplotlib**: A plotting library for creating static, animated, and interactive visualizations in Python
- **🎨 Seaborn**: Provides a high-level interface for drawing attractive and informative statistical graphics
- **🔬 SciPy**: Used for scientific and technical computing, offering modules for optimization, linear algebra, and more
- **🤖 Scikit-learn**: Implements a range of machine learning algorithms for data mining, data analysis, and machine learning tasks
- **📈 Plotly**: Creates interactive and visually appealing graphs for web publication

## 🌐 Where to Find Packages

- **📚 PyPI** - for pip packages
- **🐍 Anaconda** - for conda packages

## 📋 Listing Installed Packages

If you're running this in Google Colab, use the following command:

In [6]:
!pip list

Package                   Version
------------------------- -----------
adjustText                1.3.0
aiohappyeyeballs          2.6.1
aiohttp                   3.13.0
aiosignal                 1.4.0
anyio                     4.10.0
argon2-cffi               21.3.0
argon2-cffi-bindings      25.1.0
asttokens                 3.0.0
async-lru                 2.0.4
attrs                     24.3.0
babel                     2.16.0
beautifulsoup4            4.13.5
bleach                    6.2.0
brotlicffi                1.0.9.2
certifi                   2025.10.5
cffi                      2.0.0
charset-normalizer        3.3.2
colorama                  0.4.6
comm                      0.2.1
contourpy                 1.3.3
cycler                    0.12.1
datasets                  4.2.0
debugpy                   1.8.16
decorator                 5.2.1
defusedxml                0.7.1
dill                      0.4.0
et_xmlfile                2.0.0
executing                 2.2.1
fastjsonschema   

If you're running this locally with conda, run this:

In [7]:
!conda list

# packages in environment at D:\anaconda3\envs\haroun_env:
#
# Name                    Version                   Build  Channel
adjusttext                1.3.0                    pypi_0    pypi
aiohappyeyeballs          2.6.1                    pypi_0    pypi
aiohttp                   3.13.0                   pypi_0    pypi
aiosignal                 1.4.0                    pypi_0    pypi
anyio                     4.11.0                   pypi_0    pypi
argon2-cffi               21.3.0             pyhd3eb1b0_0  
argon2-cffi-bindings      25.1.0          py312h02ab6af_0  
asttokens                 3.0.0           py312haa95532_0  
async-lru                 2.0.4           py312haa95532_0  
attrs                     25.4.0                   pypi_0    pypi
babel                     2.16.0          py312haa95532_0  
beautifulsoup4            4.13.5          py312haa95532_0  
bleach                    6.2.0           py312haa95532_0  
brotlicffi                1.0.9.2         py312h5da7b33_

**Note**: `!pip list` also works inside a conda environment, BUT it won't include all your packages.
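
The installed packages can also be inspected from Python itself. The sketch below assumes Python 3.8 or newer, where the standard library includes `importlib.metadata`. It prints only the first ten entries to keep the output short.

```python
from importlib import metadata

# Map each installed distribution's name to its version
installed = {}
for dist in metadata.distributions():
    name = dist.metadata["Name"]
    if name:  # skip entries whose metadata has no name
        installed[name] = dist.version

# Show the first ten packages in alphabetical order
for name in sorted(installed, key=str.lower)[:10]:
    print(f"{name:<25} {installed[name]}")
```

This reports the Python packages visible to the interpreter that runs the cell, so the list will differ between Colab and a local environment.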

# Installing Packages

## Notes

Once again, depending on your package manager, you'll need to use either `pip` or `conda` to install packages.
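
Before installing anything, it can be handy to check whether a library is already available. Here is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module called `module_name` can be imported."""
    return importlib.util.find_spec(module_name) is not None


# 'csv' ships with Python; 'pyjokes' is third-party and may or may not be present
for name in ["csv", "pyjokes"]:
    status = "already available" if is_installed(name) else "needs to be installed"
    print(f"{name}: {status}")
```

Keep in mind that the name you import is not always the name you install: scikit-learn, for instance, is installed as `scikit-learn` but imported as `sklearn`.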

## Examples

### Pip Install (Google Colab example)

The general form of the command is:

```bash
pip install package_name
```

Google Colab comes with a bunch of libraries preinstalled, but here's one it doesn't have:

In [8]:
!pip install pyjokes



### Conda Install (Local example)

**Note: This example applies ONLY if you are running locally and have Anaconda installed (on Colab, pandas is already installed).**

Since pandas is outside Python's standard library, we can install it with conda.

In [None]:
!conda install pandas

# Import the Package

Now that we've installed a library, we need to import it. This lets us use it in our specific notebook / environment (we'll get more into environments later in the advanced section).

We will be showing how to import Python libraries, packages, and modules. Here's a reminder of the difference between all three:

- **Library**: A collection of packages and modules
- **Package**: A directory with Python scripts and an `__init__.py` file
- **Module**: A Python script file that can be imported

## Examples

In [None]:
import pyjokes

pyjokes.get_joke()