# Getting Started with the NDN Distributed Processing Engine

Note: You may want to run this notebook in a venv or Conda environment so that the packages installed below stay isolated from your system Python.

### Get Dependencies

In [1]:
# Get submodules
!git submodule update --init --recursive

Note: Installing local packages should automatically install pip dependencies such as `python-ndn`. 

In [2]:
# Install packages
!cd .. && for dir in ./pkg/*/; do [ -d "$dir" ] && pip install --find-links=./pkg -e "$dir"; done

### Generate Data

In [3]:
from ndn_compute_jsonl_generator import generate_large_jsonl
from ndn_compute_fs_creator import create_fs_from_directory

In [4]:
# Generate flat files
!mkdir -p ../generated_data/flat/appA
!mkdir -p ../generated_data/flat/appB

generate_large_jsonl(filename='../generated_data/flat/appA/events.log.jsonl', target_size_mb=200)
generate_large_jsonl(filename='../generated_data/flat/appB/events.log.jsonl', target_size_mb=500)

Generating JSONL file of approximately 200MB...
Progress: 22.76% complete
Records written: 100,000
Current file size: 45.51MB
Progress: 45.51% complete
Records written: 200,000
Current file size: 91.03MB
Progress: 68.27% complete
Records written: 300,000
Current file size: 136.55MB
Progress: 91.03% complete
Records written: 400,000
Current file size: 182.06MB

File generation complete!
Final file size: 200.00MB
Total records written: 439,417
Generating JSONL file of approximately 500MB...
Progress: 9.10% complete
Records written: 100,000
Current file size: 45.51MB
Progress: 18.21% complete
Records written: 200,000
Current file size: 91.03MB
Progress: 27.31% complete
Records written: 300,000
Current file size: 136.55MB
Progress: 36.41% complete
Records written: 400,000
Current file size: 182.06MB
Progress: 45.51% complete
Records written: 500,000
Current file size: 227.57MB
Progress: 54.62% complete
Records written: 600,000
Current file size: 273.09MB
Progress: 63.72% complete
Records w
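
Once generation finishes, it can be useful to confirm that the output really is line-delimited JSON before distributing it. The following is a minimal sketch, not part of the `ndn_compute_jsonl_generator` package: `peek_jsonl` is a hypothetical helper name, and it assumes each non-blank line of the file is one complete JSON value.

```python
import json
import os
from itertools import islice


def peek_jsonl(path, n=3):
    """Parse and return the records found in the first n lines of a JSONL file.

    Hypothetical helper for illustration; it is not provided by the
    ndn_compute packages.
    """
    with open(path, "r", encoding="utf-8") as f:
        # islice stops after n lines, so the large file is never read in full.
        # Blank lines among those n lines are skipped rather than parsed.
        return [json.loads(line) for line in islice(f, n) if line.strip()]


# Path taken from the generation cell above; skipped if the file is absent.
sample_path = "../generated_data/flat/appA/events.log.jsonl"
if os.path.exists(sample_path):
    for record in peek_jsonl(sample_path):
        print(record)
```

If a line is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which is usually the quickest signal that generation was interrupted.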

In [5]:
# Distribute files into a toy distributed filesystem

!mkdir -p ../generated_data/distributed
create_fs_from_directory(in_dir="../generated_data/flat",
                         out_dir="../generated_data/distributed",
                         num_partitions=2,
                         num_copies=1,
                         chunk_size=64
                         )

../generated_data/flat/appB/events.log.jsonl
../generated_data/flat/appA/events.log.jsonl
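
To check that the distribution step wrote something, you can count the files and bytes under the output directory. This is a rough sketch that makes no assumptions about the on-disk layout produced by `create_fs_from_directory`; `summarize_tree` is a hypothetical helper that only walks the directory tree.

```python
import os


def summarize_tree(root):
    """Return (file_count, total_bytes) for all files under root.

    Hypothetical helper for illustration; it does not interpret the
    partition or chunk layout, it only totals what is on disk.
    """
    file_count, total_bytes = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, total_bytes


# Output directory taken from the cell above; skipped if it does not exist.
out_dir = "../generated_data/distributed"
if os.path.isdir(out_dir):
    count, size = summarize_tree(out_dir)
    print(f"{count} files, {size / 1e6:.2f} MB under {out_dir}")
```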


### Starting the cluster

Run `docker-compose up` in a separate terminal, from the `ndn-compute` repository root, so that the cluster's log output stays visible in the foreground.

In [6]:
# !docker-compose up

# Make sure your cluster is running
!docker-compose ps

NAME      IMAGE                 COMMAND                  SERVICE   CREATED         STATUS         PORTS
driver1   ndn-compute-driver    "python -m ndn_compu…"   driver    4 seconds ago   Up 3 seconds   0.0.0.0:5214->5214/tcp
nfd1      ndn-compute-nfd       "/usr/bin/nfd --conf…"   nfd       3 days ago      Up 3 seconds   6363/tcp, 9696/tcp, 6363/udp
worker1   ndn-compute-worker1   "python -m ndn_compu…"   worker1   4 seconds ago   Up 3 seconds   
worker2   ndn-compute-worker2   "python -m ndn_compu…"   worker2   4 seconds ago   Up 3 seconds   


IMPORTANT: You should see a driver, an NFD, and at least one worker, each with a status of `Up`.
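
As an extra check from inside the notebook, you can test whether anything is accepting TCP connections on the driver's published port (5214 in the `docker-compose ps` output above). This is only a sketch: `is_port_open` is a hypothetical helper, and a successful connection shows that the port is reachable, not that the driver is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; it checks reachability only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print("driver port reachable:", is_port_open("localhost", 5214))
```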

### Using the engine

In [7]:
from ndn_compute_client import NdnComputeClient

In [8]:
client = NdnComputeClient('http://localhost:5214')

In [9]:
client.add(8, 9)

17

In [10]:
# TODO: write the code then show people how to actually process data