Merged
20 changes: 15 additions & 5 deletions .devcontainer/devcontainer.json
@@ -1,17 +1,22 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
"name": "Python 3",
// Container definition for a Python 3.11 development environment
"name": "Python 3.11",
"image": "mcr.microsoft.com/devcontainers/python:0-3.11",
"onCreateCommand": "sudo apt update && sudo apt upgrade -y && pip3 install --upgrade pip && pip3 install --user -r requirements.txt",

// Custom configuration options
"customizations": {
"vscode": {

// Use 'settings' to set default VS code values on container create
"settings": {
"jupyter.kernels.excludePythonEnvironments": ["/usr/bin/python3"],
"remote.portsAttributes": {
"ipykernel_launcher": {"onAutoForward": "ignore"}
}
"remote.autoForwardPorts": false,
"remote.restoreForwardedPorts": false
},

// Add the IDs of VS code extensions you want to install here
"extensions": [
"-dbaeumer.vscode-eslint",
"ms-python.python",
@@ -20,5 +25,10 @@
]
}
},

// Use 'onCreateCommand' to run commands once when the container is created
"onCreateCommand": "sudo apt update && sudo apt upgrade -y && pip3 install --upgrade pip && pip3 install --user -r requirements.txt",

// Use 'postAttachCommand' to run commands each time a user connects to the container
"postAttachCommand": "htop"
}
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,2 +1,4 @@
.vscode
__pycache__
.ipynb_checkpoints
.vscode
.venv
170 changes: 116 additions & 54 deletions README.md
@@ -1,69 +1,131 @@
# Algorithm Optimization Project
# Algorithm Optimization Project - Machine Learning

![Preview](assets/preview.png)
[![Codespaces Prebuilds](https://github.com/4GeeksAcademy/gperdrizet-algorithm-optimization-project-machine-learning/actions/workflows/codespaces/create_codespaces_prebuilds/badge.svg)](https://github.com/4GeeksAcademy/gperdrizet-algorithm-optimization-project-machine-learning/actions/workflows/codespaces/create_codespaces_prebuilds)

This repository contains exercises designed to help you practice optimizing Python algorithms for better performance and readability.
A comprehensive programming optimization project focused on improving algorithm efficiency and code performance. This project demonstrates essential optimization techniques through practical exercises involving text processing and list operations.

## What You'll Learn
![Project Preview](assets/preview.png)

- Text processing optimization techniques
- Efficient list operations and filtering
- Using Python's built-in functions and data structures
- Code modularity and best practices
- Performance analysis and improvement strategies

## Assignment Overview
## Project Overview

The `problems.ipynb` notebook contains two main exercises:
This project focuses on algorithm optimization through two main exercises that teach fundamental performance improvement techniques:

1. **Text Processing Optimization** - Improve code that processes text by converting to lowercase, removing punctuation, counting word frequencies, and finding the most common words.
**Exercise 1: Text Processing Optimization**
- Convert text to lowercase
- Remove punctuation marks efficiently
- Count word frequencies
- Extract most common words
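A minimal sketch of how the four steps above might be combined, assuming whitespace-separated words and ASCII punctuation only. The function name `most_common_words` and the sample sentence are illustrative and are not taken from the assignment notebook.

```python
import string
from collections import Counter


def most_common_words(text: str, top_n: int = 3) -> list[tuple[str, int]]:
    """Return the top_n most frequent words, ignoring case and punctuation."""
    # A deletion table lets str.translate strip all ASCII punctuation in one pass
    table = str.maketrans("", "", string.punctuation)
    words = text.lower().translate(table).split()
    # Counter tallies the words; most_common sorts by count, descending
    return Counter(words).most_common(top_n)


sample = "The cat sat. The cat ran! The dog barked?"
print(most_common_words(sample, 2))  # [('the', 3), ('cat', 2)]
```

Using `str.translate` and `Counter` avoids a character-by-character loop and a hand-rolled dictionary, which is the kind of change this exercise appears to be aiming for.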

2. **List Processing Optimization** - Enhance code that filters even numbers, duplicates values, sums results, and checks for prime numbers.
**Exercise 2: List Processing Optimization**
- Filter even numbers from lists
- Duplicate list elements
- Sum numerical values
- Detect prime numbers
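One possible shape for these list operations, shown as a hedged sketch. It assumes "duplicate" means doubling each value (the notebook may instead mean repeating elements), and the helper names `is_prime` and `sum_doubled_evens` are hypothetical.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division by odd numbers up to the integer square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    # all() stops at the first divisor that leaves no remainder
    return all(n % d for d in range(3, math.isqrt(n) + 1, 2))


def sum_doubled_evens(numbers: list[int]) -> int:
    """Filter the even values, double each one, and sum them in a single pass."""
    # Assumption: "duplicate" is read as "multiply by two"
    return sum(n * 2 for n in numbers if n % 2 == 0)


print(sum_doubled_evens([1, 2, 3, 4, 5, 6]))  # 24
print(is_prime(97), is_prime(91))  # True False
```

The generator expression avoids building intermediate lists, and stopping the divisor search at `math.isqrt(n)` keeps the primality test far cheaper than checking every number below `n`.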

The project provides hands-on experience with:
- Code refactoring and optimization
- Efficient data structure usage
- Python built-in function utilization
- Modular programming practices
- Performance analysis and improvement

Each exercise includes working but inefficient code that you'll optimize using better algorithms, data structures, and Python idioms.

## Getting Started

### Option 1: GitHub Codespaces (Recommended)
1. Fork this repository to your GitHub account
2. Click the green "Code" button on your forked repository
3. Select "Codespaces" tab
4. Click "Create codespace on main"
5. Wait for the environment to load (this may take a few minutes)
6. Open `problems.ipynb` and start working!

1. **Fork the Repository**
- Click the "Fork" button on the top right of the GitHub repository page
- 4Geeks students: set 4GeeksAcademy as the owner, since 4Geeks pays for your codespace usage. Everyone else: set yourself as the owner
- Give the fork a descriptive name. 4Geeks students: consider including your GitHub username so the fork is easier to find if you lose the link
- Click "Create fork"
- 4Geeks students: bookmark or otherwise save the link to your fork

2. **Create a GitHub Codespace**
- On your forked repository, click the "Code" button
- Select "Create codespace on main"
- If the "Create codespace on main" option is grayed out, open your codespaces list from the three-bar menu at the upper left and delete an old codespace
- Wait for the environment to load (dependencies are pre-installed)

3. **Start Working**
- Open `notebooks/assignment.ipynb` in the Jupyter interface
- Follow the step-by-step instructions in the notebook

### Option 2: Local Development
1. Fork and clone this repository
2. Create a virtual environment: `python -m venv venv`
3. Activate the virtual environment:
- On Windows: `venv\Scripts\activate`
- On macOS/Linux: `source venv/bin/activate`
4. Install Jupyter: `pip install jupyter`
5. Install dependencies: `pip install -r requirements.txt`
6. Launch Jupyter: `jupyter notebook`
7. Open `problems.ipynb`

## Working with the Notebook

- Each exercise contains the original inefficient code followed by optimization points
- Review the provided solutions as reference implementations
- Try implementing your own optimizations before checking the solutions
- Run each cell to test your code and compare performance

## Learning Goals

By completing this assignment, you will:
- Understand common performance bottlenecks in Python code
- Learn to use appropriate data structures for different problems
- Practice writing clean, modular, and efficient code
- Gain experience with Python's built-in optimization tools

## Assessment

Focus on:
- **Correctness**: Does your optimized code produce the same results?
- **Efficiency**: Is your solution faster and more memory-efficient?
- **Readability**: Is your code clean and well-structured?
- **Best Practices**: Are you using appropriate Python idioms?

Happy coding!

1. **Prerequisites**
- Git
- Python >= 3.10

2. **Fork the repository**
- Click the "Fork" button on the top right of the GitHub repository page
- Optional: give the fork a new name and/or description
- Click "Create fork"

3. **Clone the repository**
- From your fork of the repository, click the green "Code" button at the upper right
- From the "Local" tab, select HTTPS and copy the link
- Run the following commands on your machine, replacing `<LINK>` and `<REPO_NAME>`

```bash
git clone <LINK>
cd <REPO_NAME>
```

4. **Set Up Environment**

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

5. **Launch Jupyter & start the notebook**
```bash
jupyter notebook notebooks/assignment.ipynb
```


## Project Structure

```
├── .devcontainer/ # Development container configuration
├── assets/ # Files and resources directory
├── notebooks/ # Jupyter notebook directory
│ ├── assignment.ipynb # Assignment notebook with exercises
│ └── solution.ipynb # Solution notebook with optimized code
├── .gitignore # Files/directories not tracked by git
├── requirements.txt # Python dependencies
└── README.md # Project documentation
```


## Learning Objectives

1. **Algorithm Analysis**: Identify performance bottlenecks in existing code
2. **Data Structure Optimization**: Use appropriate Python data structures for efficiency
3. **Built-in Functions**: Leverage Python's optimized built-in functions
4. **List Comprehensions**: Replace loops with more efficient comprehensions
5. **Modular Design**: Break code into focused, reusable functions
6. **Performance Comparison**: Understand the impact of different approaches
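As a rough illustration of objectives 4 and 6, the sketch below times an explicit loop against an equivalent list comprehension using the standard-library `timeit` module. The function names, input size, and repeat count are arbitrary choices for this example, and actual timings will vary by machine and Python version.

```python
import timeit


def evens_loop(numbers):
    """Collect even values with an explicit loop and append calls."""
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n)
    return result


def evens_comprehension(numbers):
    """Collect even values with a list comprehension."""
    return [n for n in numbers if n % 2 == 0]


data = list(range(10_000))

# Check correctness first: an optimization must not change the result
assert evens_loop(data) == evens_comprehension(data)

for func in (evens_loop, evens_comprehension):
    seconds = timeit.timeit(lambda: func(data), number=200)
    print(f"{func.__name__}: {seconds:.4f} s for 200 runs")
```

Verifying that both versions return the same output before timing them keeps the comparison honest: a faster function that produces different results is not an optimization.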

## Technologies Used

- **Python 3.11**: Core programming language
- **collections**: `Counter` for efficient frequency counting
- **string**: Built-in string processing utilities
- **math**: Mathematical operations and functions
- **Jupyter**: Interactive development environment


## Contributing

This is an educational project. Contributions for improving the optimization examples or adding new exercises are welcome:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request