# Memory mapping in Konduit Serving 

A `MemMapConfig` enables memory mapping for a Konduit Serving instance. Rather than reading an entire large file into memory, memory mapping loads small segments of the file only when they are needed. This can reduce the amount of data moved by I/O, and with it the number of access operations needed to retrieve data.
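
To illustrate the general idea outside of Konduit Serving, here is a minimal sketch using NumPy's own memory-mapping support (`np.load` with `mmap_mode`). The file name and array size are made up for the illustration, and this is not a description of how Konduit Serving implements `MemMapConfig` internally.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file used only for this illustration.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.npy")
np.save(demo_path, np.arange(1_000_000, dtype=np.int64).reshape(1000, 1000))

# mmap_mode="r" maps the file read-only instead of reading it all into memory.
mapped = np.load(demo_path, mmap_mode="r")

# Indexing touches only the part of the file that backs this row;
# np.array(...) copies that row into an ordinary in-memory array.
row = np.array(mapped[500])
print(type(mapped).__name__, row[:3])
```

The mapped object behaves like a regular array for indexing and slicing, which is why the rest of the code does not need to know whether the data lives in memory or on disk.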

A `MemMapConfig` can be declared as part of an `InferenceConfiguration`.

In [1]:
from konduit import PythonConfig, PythonStep, ServingConfig, \
    InferenceConfiguration, MemMapConfig
from konduit.client import Client
from konduit.server import Server 
import time 
import numpy as np 
import os 

# Create the output directory, including any missing parent directories.
os.makedirs("../data/memmap", exist_ok=True)

In [2]:
# Save the array to be memory-mapped and the "unknown" vector as .npy files.
array_path = "../data/memmap/array_path.npy"
unk_vector_path = "../data/memmap/unk_vector.npy"
array = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
unk_vector = np.array([[3, 4, 5, 6]])
np.save(array_path, array)
np.save(unk_vector_path, unk_vector)
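
The notebook does not say how the two files are used. One plausible reading, stated here purely as an assumption, is that `array_path` holds rows to be looked up by index and `unk_vector_path` holds a fallback row for indices that are out of range. The sketch below imitates that behaviour with plain NumPy in a scratch directory; `lookup_rows` is a hypothetical helper, not part of the Konduit API.

```python
import os
import tempfile

import numpy as np

# Recreate the two arrays from the cell above in a scratch directory.
tmp_dir = tempfile.mkdtemp()
demo_array_path = os.path.join(tmp_dir, "array_path.npy")
demo_unk_path = os.path.join(tmp_dir, "unk_vector.npy")
np.save(demo_array_path, np.array([[1, 3, 5, 7], [2, 4, 6, 8]]))
np.save(demo_unk_path, np.array([[3, 4, 5, 6]]))


def lookup_rows(indices, array_file, unk_file):
    """Hypothetical helper: return rows by index, with a fallback row.

    ASSUMPTION: out-of-range indices map to the "unknown" vector.
    """
    rows = np.load(array_file, mmap_mode="r")  # mapped, not fully loaded
    unk = np.load(unk_file)[0]
    picked = [
        np.array(rows[i]) if 0 <= i < rows.shape[0] else unk
        for i in indices
    ]
    return np.stack(picked)


result = lookup_rows([1, -2], demo_array_path, demo_unk_path)
print(result)
```

Under this assumption, index `1` returns the second stored row and the invalid index `-2` returns the unknown vector.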

In [3]:
# Define a Python step that adds 2 to the input array `first`.
python_config = PythonConfig(
    python_code='first += 2',
    python_inputs={'first': 'NDARRAY'},
    python_outputs={'first': 'NDARRAY'},
)

python_step = PythonStep().step(python_config)

In [4]:
inference_config = InferenceConfiguration(
    pipeline_steps=[python_step], 
    serving_config=ServingConfig(http_port=9999), 
    mem_map_config=MemMapConfig(
        array_path="../data/memmap/array_path.npy",
        unk_vector_path="../data/memmap/unk_vector.npy"
    )
)

In [5]:
server = Server(inference_config=inference_config)

In [6]:
server.start()
time.sleep(10)  # give the server time to start before sending requests
client = Client(url='http://localhost:9999')  # port set in ServingConfig above
output = client.predict({"default": np.array([4])})
print(output)
server.stop()

[[6]]
