# Smallpond Lightweight Analytics at Scale

In today’s data landscape, teams often face a trade-off between simplicity and scale. Traditional analytics stacks can be heavy, costly, and complex to set up, especially when the goal is to move fast and process big data efficiently. Smallpond bridges that gap by focusing on:

⚡ Speed — Fast setup and minimal overhead for transformation of big data.

📦 Simplicity — Simple interfaces and modular components that just work.

☁️ Scalability — Designed to handle distributed data workloads seamlessly.

🧩 Interoperability — Works with common data formats and integrates smoothly with existing tools like pandas and duckdb.

In this notebook, we’ll explore how Smallpond can be used to build lightweight, high-performance transformation pipelines.

But first, we need to set up the environment and infrastructure to test its functionality.

## Virtual environment creation

### It is very important to create the environment with Python 3.11.11, because this version needs to match the Python version that Ray is compatible with.
If you don't have this version, you can install it with `pyenv install 3.11.11`.
1. Create a Virtual Environment
This command creates a new virtual environment in a folder named venv. Before running it, check with `python3 --version` that `python3` resolves to 3.11.11.
```sh
python3 -m venv venv
```
2. Activate the Virtual Environment
On macOS/Linux:
```sh
source venv/bin/activate
```
On Windows (Command Prompt):
```sh
venv\Scripts\activate
```

You should now see (venv) appear at the start of your terminal prompt — this means your environment is active.

3. Install Dependencies
```sh
pip install -r requirements.txt
```
This installs all the necessary libraries specified in the requirements.txt file.
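
Because the Python version matters for Ray, it can help to confirm which interpreter the activated environment actually uses. The following is a minimal sketch, not part of the project: `version_matches` is a hypothetical helper name, and the check assumes that any 3.11.x interpreter is acceptable, which you may want to tighten to exactly 3.11.11.

```sh
# Hypothetical helper: succeeds when the version string in $2 equals the
# required major.minor in $1, or starts with it followed by a dot.
version_matches() {
    required="$1"   # e.g. 3.11
    actual="$2"     # e.g. 3.11.11
    case "$actual" in
        "$required"|"$required".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the active interpreter for its version; fall back to "unknown"
# if no `python` command is available on the PATH.
actual="$(python -c 'import platform; print(platform.python_version())' 2>/dev/null || echo unknown)"

if version_matches 3.11 "$actual"; then
    echo "OK: Python $actual"
else
    echo "Warning: expected Python 3.11.x, found $actual"
fi
```

If the warning appears, recreate the virtual environment with the correct interpreter before installing the dependencies again.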




## Installation of Docker and Docker Compose
### Installation for Windows
https://docs.docker.com/desktop/setup/install/windows-install/
### Installation for MacOS
https://docs.docker.com/desktop/setup/install/mac-install/
### Installation for Linux
https://docs.docker.com/desktop/setup/install/linux/


## Running Jupyter Notebooks
With the virtual environment activated, run:
```sh
jupyter notebook --port 8889
```
This opens a new browser window listing all the folders of this project.


## Building, deploying, and testing the docker-compose infrastructure locally

1. Verify Docker and Docker Compose are available:
```sh
docker --version
docker compose version
```
2. Build your images
From your project root (where docker-compose.yml is located), run:
```sh
docker compose build
```
This command reads your docker-compose.yml, builds any services that use the build: directive, and downloads the necessary base images.

3. Start the containers
```sh
docker compose up -d
```
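This starts all services defined in your docker-compose.yml in detached mode (in the background).

To stop and remove all containers, run the command below. The `-v` flag also removes the volumes declared in docker-compose.yml, so omit it if you want to keep persisted data: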
```sh
docker compose down -v
```

If you make changes to the code or to docker-compose.yml, you can rebuild and restart the services with:
```sh
docker compose up -d --build
```
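
The steps above can be preceded by a small preflight check, so that a missing tool is reported before anything is built. This is only a sketch under stated assumptions: `have_cmd` and `preflight` are hypothetical helper names, not part of this project, and the usage line assumes you run it from the project root.

```sh
# Hypothetical helper: true if the given command is found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report each required command and return non-zero
# if at least one of them is missing.
preflight() {
    missing=0
    for cmd in "$@"; do
        if have_cmd "$cmd"; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (from the project root, where docker-compose.yml is located):
#   preflight docker && docker compose version \
#     && docker compose up -d --build && docker compose ps
```

`docker compose ps` at the end lists the services and their current state, which is a quick way to confirm that everything started.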




# Your local environment is now ready for testing! Open Smallpond.ipynb to check that everything is working correctly.
