# 🚀 Getting Started with **Neuralk API**

Follow these four simple steps to create your account and start using our powerful prediction model **NICL**.

---

### **Step 1 - Get Your Access Code**

Your Neuralk contact has given you an **access code**.  
If not, simply **request one from your Neuralk contact**. The code grants you **free credits**, which you can spend on predictions with our in-house model **NICL**.

---

### **Step 2 - Install the Neuralk SDK**

Download our official Python library directly from PyPI:

```bash
pip install neuralk
```

---

### **Step 3 - Create your organization**

Once the SDK is installed, run the account creation command and enter your access code.
In just a few seconds, your organization, along with its first user, will be created on the Neuralk platform.

---

### **Step 4 - Add users to your organization**

The first user in your organization can create additional users.  
**Everyone in your organization shares the same balance of credits** to use NICL.









## Step 1: Get your access code

If you didn't receive an access code, please contact us.

### Step 1b: Configure credentials with environment variables (recommended)

To avoid hardcoding secrets in this notebook, create a `.env` file **next to this notebook** (or export variables in your shell) and load it with `python-dotenv`.

Example `.env`:

```bash
# Required only if you create an organization (Step 3, which is optional)
ACCESS_CODE="YOUR_ACCESS_CODE"
ORGANIZATION_NAME="YOUR_ORG_NAME"
ADMIN_EMAIL="admin@your-company.com"
ADMIN_PASSWORD="a-strong-password"

# Optional: control whether the guarded account creation cell runs
RUN_ACCOUNT_CREATION=false

# Optional: credentials used by the inference client (defaults to ADMIN_* if unset)
NEURALK_USERNAME="admin@your-company.com"
NEURALK_PASSWORD="a-strong-password"
```

Good practice:
- Never commit your `.env` file. Add it to `.gitignore`.
- Use different credentials per environment (dev, staging, prod).


## Step 2: Install dependencies once

In [1]:
%%capture
%pip install -q neuralk scikit-learn python-dotenv xgboost skrub numpy

In [2]:
from dotenv import load_dotenv, find_dotenv

# Loads variables from a local .env file (if present).
# By default, we do NOT override already-exported environment variables.
load_dotenv(find_dotenv(), override=False)


True
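
The effect of `override=False` can be pictured with a standard-library sketch: values read from the file only fill in names that are not already set. The function `merge_without_override` and the sample values are illustrative assumptions, not python-dotenv internals.

```python
def merge_without_override(
    file_values: dict[str, str], environment: dict[str, str]
) -> dict[str, str]:
    """Return a copy of `environment` where file values only fill missing names."""
    merged = dict(environment)
    for name, value in file_values.items():
        # setdefault keeps an existing (already exported) value untouched.
        merged.setdefault(name, value)
    return merged


# Illustrative values only.
exported = {"NEURALK_USERNAME": "ci-bot@your-company.com"}
from_file = {
    "NEURALK_USERNAME": "admin@your-company.com",
    "RUN_ACCOUNT_CREATION": "false",
}

merged = merge_without_override(from_file, exported)
print(merged)
```

Here the exported username wins, while `RUN_ACCOUNT_CREATION` is taken from the file because nothing else defined it.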

In [3]:
# Imports and helpers
import os

# -----------------------------------------------------------------------------
# Environment variables
# -----------------------------------------------------------------------------
def _getenv(name: str, default: str | None = None, required: bool = False) -> str | None:
    value = os.getenv(name, default)
    if required and (value is None or str(value).strip() == ""):
        raise ValueError(
            f"Missing required environment variable: {name}. "
            "Create a .env file (see Step 1b) or export it in your shell."
        )
    return value

# Guarded: keep account creation off by default
RUN_ACCOUNT_CREATION = str(os.getenv("RUN_ACCOUNT_CREATION", "false")).lower() in {"1", "true", "yes", "y"}

# Credentials used to create an organization (only needed if RUN_ACCOUNT_CREATION is True)
ACCESS_CODE = _getenv("ACCESS_CODE", required=RUN_ACCOUNT_CREATION)
ORGANIZATION_NAME = _getenv("ORGANIZATION_NAME", required=RUN_ACCOUNT_CREATION)
ADMIN_EMAIL = _getenv("ADMIN_EMAIL", required=RUN_ACCOUNT_CREATION)
ADMIN_PASSWORD = _getenv("ADMIN_PASSWORD", required=RUN_ACCOUNT_CREATION)

# Credentials used by the inference client.
# If not provided, we fall back to ADMIN_* so users can configure once.
NEURALK_USERNAME = _getenv("NEURALK_USERNAME") or ADMIN_EMAIL
NEURALK_PASSWORD = _getenv("NEURALK_PASSWORD") or ADMIN_PASSWORD
import warnings
from dataclasses import dataclass
from typing import List

import numpy as np
from neuralk.neuralk import create_account
from neuralk import Neuralk, Classifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

RANDOM_STATE = 42
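# RUN_ACCOUNT_CREATION is read from the environment above and stays off by default.
# Do not reassign it here: that would bypass the guard on the account creation cell.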
RUN_USER_CREATION = False     # flip to True to create the extra user

def show_org(client: Neuralk) -> None:
    org = client.organization.get()
    print("Organization name:", org.name)
    print("Organization users:", [u.email for u in org.user_list])
    print("Organization has licence:", org.has_licence)
    print("Organization credits available:", client.organization.get_credits_available())

## Step 3: Create your organization (optional guarded cell)

Set `RUN_ACCOUNT_CREATION` to true only when you are ready to create the organization with the credentials above.

In [None]:
if RUN_ACCOUNT_CREATION:
    create_account(
        access_code=ACCESS_CODE,
        organization_name=ORGANIZATION_NAME,
        email=ADMIN_EMAIL,
        firstname="User_Firstname",
        lastname="User_Lastname",
        password=ADMIN_PASSWORD,
    )
else:
    print("Skipping account creation (set RUN_ACCOUNT_CREATION = True to run).")

### You can now connect to our platform

In [4]:
print("ADMIN_EMAIL loaded:", bool(ADMIN_EMAIL))
if ADMIN_EMAIL:
    print("ADMIN_EMAIL:", ADMIN_EMAIL)


ADMIN_EMAIL loaded: True
ADMIN_EMAIL: theo.marcolini@neuralk-ai.com


In [5]:
client = Neuralk(
    user_id=NEURALK_USERNAME,
    password=NEURALK_PASSWORD,
)

print("Connected as:", NEURALK_USERNAME)


Connected as: theo.marcolini@neuralk-ai.com


### Access your organization

In [6]:
show_org(client)

Organization name: NeuralkAI
Organization users: ['theo.marcolini@neuralk-ai.com']
Organization has licence: False
Organization credits available: 4985.0


## Step 4: Add users to your organization

In [None]:
if RUN_USER_CREATION:
    client.users.create(
        email="user2@mycompany.com",
        firstname="User_Firstname",
        lastname="User_Lastname",
        password="some_password",
    )
    print("User created: user2@mycompany.com")
else:
    print("Skipping user creation (set RUN_USER_CREATION = True to run).")


In [None]:
if RUN_USER_CREATION:
    new_client = Neuralk(
        user_id="user2@mycompany.com",
        password="some_password",
    )

    # Fetch the organization once and reuse the result.
    org = new_client.organization.get()
    print("Organization name: ", org.name)
    print("Organization users: ", org.user_list)
    print("Organization has licence: ", org.has_licence)
    print("Organization credits available: ", new_client.organization.get_credits_available())
else:
    print("User2 not created; skipping secondary login.")

## Step 5: Run a simple classification problem (synthetic, deterministic)

In [None]:
X, y = make_classification(random_state=RANDOM_STATE)
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.25,
    stratify=y,
    random_state=RANDOM_STATE,
)
print(f"{X_train.shape=} {y_train.shape=} {X_test.shape=} {y_test.shape=}")
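
The `stratify=y` argument keeps the class proportions the same in the training and test sets. The sketch below is a simplified standard-library stand-in for that idea, not scikit-learn's implementation; the helper `stratified_split` and the toy labels are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction, seed):
    """Split indices so each class contributes about test_fraction of its rows to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 60 rows of class 0 and 40 rows of class 1.
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels, test_fraction=0.25, seed=42)

test_counts = Counter(labels[i] for i in test_idx)
print(sorted(test_counts.items()))
```

With a 60/40 class balance and a 25% test fraction, the test set receives 15 rows of class 0 and 10 rows of class 1, so the 60/40 ratio is preserved.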


In [None]:
# Authenticate the inference model.
# We use setdefault so we do not overwrite values if the user already exported them.
os.environ.setdefault("NEURALK_USERNAME", NEURALK_USERNAME or "")
os.environ.setdefault("NEURALK_PASSWORD", NEURALK_PASSWORD or "")

# Note: nothing actually happens during fit() -- in-context learning models are
# pretrained but require no fitting on our specific dataset.
classifier = Classifier().fit(X_train, y_train)

predictions = classifier.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")
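
Accuracy is the fraction of test samples whose predicted label equals the true label. A minimal standard-library sketch of the same computation, with made-up labels for illustration:

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Made-up labels: 6 of the 8 predictions match.
y_true_demo = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred_demo = [0, 1, 0, 0, 1, 1, 1, 1]
print(f"Accuracy: {accuracy(y_true_demo, y_pred_demo):.3f}")
```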


## Step 6: Example on a real dataset

### **Download OpenML dataset**

For this tutorial we use the **GesturePhaseSegmentationProcessed** dataset from OpenML.


It contains numerical motion features extracted from 7 videos of people gesturing, including velocities and accelerations of both hands and wrists along x, y, and z coordinates, as well as scalar velocities. In summary:

- 9873 samples, 32 numerical features
- Each sample labeled with a gesture phase (D, P, S, H, R)
- Goal: Predict the gesture phase based on hand motion features.


In [7]:
import numpy as np

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OrdinalEncoder
from sklearn.impute import SimpleImputer
from sklearn.datasets import fetch_openml
from sklearn.pipeline import Pipeline

from skrub import TableVectorizer

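# Load data (OpenML ID 4538 is GesturePhaseSegmentationProcessed).
# Alternative OpenML IDs noted here: 11, 1467, 1067.
dataset = fetch_openml(data_id=4538)

# fetch_openml returns a Bunch; features and target are attributes of it.
X, y = dataset.data, dataset.target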

print(X.shape, y.shape)

(9873, 32) (9873,)


In [8]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set size: {X_train.shape[0]}")
print(f"Test set size: {X_test.shape[0]}")
print(f"Number of features: {X_train.shape[1]}")
print(f"Number of classes: {len(np.unique(y))}")

Training set size: 7898
Test set size: 1975
Number of features: 32
Number of classes: 5
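
The split sizes above follow from simple arithmetic. The sketch assumes the test count is rounded up and the remaining rows go to training, which reproduces the numbers printed by the cell.

```python
import math

n_samples = 9873
test_size = 0.2

# Assumption: the test count is rounded up, the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(f"Training set size: {n_train}")
print(f"Test set size: {n_test}")
```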


In [9]:
vec = TableVectorizer(
    numeric=Pipeline([
        ("imputer", SimpleImputer(strategy="mean")),
        ("scaler", StandardScaler()),
    ]),
    low_cardinality=Pipeline([
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("encoder", OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)),
    ]),
)

X_train_vec = vec.fit_transform(X_train)
X_test_vec = vec.transform(X_test)

print(f"  - Training shape: {X_train_vec.shape}")
print(f"  - Test shape: {X_test_vec.shape}")


  - Training shape: (7898, 32)
  - Test shape: (1975, 32)
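
To make the preprocessing concrete, here is a simplified standard-library sketch of what the two pipelines do to a single column: statistics are learned on the training column only, then applied unchanged to the test column. The helper names are illustrative, categorical imputation is left out, and this is not how skrub or scikit-learn implement it. Since this dataset has 32 numerical features, only the numeric branch matters here.

```python
from statistics import fmean, pstdev


def fit_numeric(train_column):
    """Learn the fill value, mean and std from a training column (None = missing)."""
    observed = [v for v in train_column if v is not None]
    fill = fmean(observed)  # mean imputation
    imputed = [fill if v is None else v for v in train_column]
    std = pstdev(imputed)  # population std, computed after imputation
    return fill, fmean(imputed), std or 1.0  # constant column: divide by 1


def transform_numeric(column, fill, mean, std):
    """Apply the training statistics to any column (train or test)."""
    return [((fill if v is None else v) - mean) / std for v in column]


def fit_ordinal(train_column):
    """Map each category seen in training to an integer code (sorted order)."""
    return {cat: code for code, cat in enumerate(sorted(set(train_column)))}


def transform_ordinal(column, mapping, unknown_value=-1):
    """Encode categories; anything unseen in training becomes unknown_value."""
    return [mapping.get(cat, unknown_value) for cat in column]


stats = fit_numeric([1.0, None, 3.0])
scaled_test = transform_numeric([2.0, None, 4.0], *stats)
codes = transform_ordinal(["blue", "green"], fit_ordinal(["red", "blue", "red"]))
print(scaled_test, codes)
```

Fitting on the training data only keeps information from the test set out of the preprocessing step.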


### Start Making Predictions with NICL

Behind Neuralk's `Classifier()` is NICL, our foundation model pre-trained on millions of synthetic tabular datasets.

Instead of training a model from scratch on your data, NICL relies on an In-Context Learning (ICL) architecture that uses your training data as context. This lets you make predictions on your test set right away, without any traditional model training on your dataset.

You can learn more about it here.
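
The "training data as context" idea can be illustrated with a toy classifier whose `fit()` only stores the data and whose `predict()` does all the work. The sketch uses a 1-nearest-neighbour rule purely as a stand-in; it is not NICL and does not reflect its architecture, and the class name `ContextClassifier` is hypothetical.

```python
import math


class ContextClassifier:
    """Toy stand-in: fit() stores the context, predict() reads it at inference time."""

    def fit(self, X, y):
        if len(X) != len(y) or len(X) == 0:
            raise ValueError("X and y must be non-empty and of equal length")
        # No parameters are learned: the training data is simply kept as context.
        self.X_context_ = [list(row) for row in X]
        self.y_context_ = list(y)
        return self

    def predict(self, X):
        predictions = []
        for row in X:
            distances = [math.dist(row, ctx) for ctx in self.X_context_]
            nearest = distances.index(min(distances))
            predictions.append(self.y_context_[nearest])
        return predictions


clf = ContextClassifier().fit(
    [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]],
    ["rest", "rest", "stroke"],
)
print(clf.predict([[0.2, 0.1], [4.0, 4.5]]))
```

The interface mirrors the scikit-learn style `fit` / `predict` calls used in this notebook, which is why swapping in `Classifier()` requires no change to the surrounding code.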

In [None]:
classifier = Classifier().fit(X_train_vec, y_train)

print(
    "✓ Classifier fitted successfully - "
    "In our case, this only means saving the X_train and y_train\n"
    " in the classifier object for the inference"
)

# Make predictions
predictions = classifier.predict(X_test_vec)

acc = accuracy_score(y_test, predictions)
print(f"✓ Accuracy: {acc:.3f}")

✓ Classifier fitted successfully - In our case, this only means saving the X_train and y_train
 in the classifier object for the inference


### **Let's compare with a traditional model (XGBoost)**

In [None]:
import xgboost as xgb
from sklearn.preprocessing import LabelEncoder

# XGBoost requires integer class labels (0..num_classes-1)
le = LabelEncoder()
y_train_encoded = le.fit_transform(y_train)
y_test_encoded = le.transform(y_test)  # use same encoder

# Traditional models like XGBoost need explicit training on each dataset and
# do not benefit from priors learned across millions of datasets.
xgb_model = xgb.XGBClassifier(
    n_estimators=300,
    max_depth=6,
    learning_rate=0.1,
    subsample=0.8,
    colsample_bytree=0.8,
    eval_metric="mlogloss",  # multi-class log loss (the target has 5 classes)
    tree_method="hist",
    enable_categorical=True,
    verbosity=0,
    random_state=RANDOM_STATE,
)

xgb_model.fit(X_train_vec, y_train_encoded)

pred = xgb_model.predict(X_test_vec)
acc_xgb = accuracy_score(y_test_encoded, pred)
print("XGBoost Accuracy:", acc_xgb)
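
The label encoding step used before fitting XGBoost maps each distinct class label to an integer in `0..num_classes-1`, and the same mapping must be reused for the test labels. A minimal standard-library sketch of that mapping, with an illustrative helper name:

```python
def fit_label_encoder(labels):
    """Map each distinct label to an integer 0..n_classes-1 (sorted order)."""
    classes = sorted(set(labels))
    return classes, {label: code for code, label in enumerate(classes)}


# Gesture phase labels from the dataset description.
classes, to_code = fit_label_encoder(["S", "D", "P", "H", "R", "D"])

encoded = [to_code[label] for label in ["D", "R", "S"]]
decoded = [classes[code] for code in encoded]  # inverse mapping
print(classes, encoded, decoded)
```

Keeping `classes` around makes it possible to turn integer predictions back into the original gesture phase labels.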

### Observations

In this example, NICL outperforms XGBoost without any model training on the target data, thanks to its pre-training strategy and ICL architecture.

On real-world tabular datasets, NICL can achieve more than **30% higher performance** compared to classic ML methods. You can explore the full benchmark [here](https://dashboard.neuralk-ai.com/#/industrial/product-categorization).

**When to use**

NICL performs best on datasets with up to 250 features and 15,000 samples, offering reliable and consistent performance out of the box.

For larger problems, NICL can scale to 1 million samples and 500 features, though performance may vary depending on input complexity.

NICL is ideal when you:
- Need strong baseline performance without hyper-parameter tuning.
- Want a unified approach to handle mixed feature types.
- Are exploring new datasets and want fast iteration.
- Prefer interpretability and flexible prompting over black-box optimisation.
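
The size guidance above can be turned into a quick check before sending data. The thresholds come from the text; the helper `nicl_size_tier` and its tier names are hypothetical and are not part of the Neuralk SDK.

```python
# Limits quoted in the guidance above.
RECOMMENDED_MAX_SAMPLES, RECOMMENDED_MAX_FEATURES = 15_000, 250
SUPPORTED_MAX_SAMPLES, SUPPORTED_MAX_FEATURES = 1_000_000, 500


def nicl_size_tier(n_samples: int, n_features: int) -> str:
    """Classify a dataset shape against the stated NICL size guidance."""
    if n_samples <= RECOMMENDED_MAX_SAMPLES and n_features <= RECOMMENDED_MAX_FEATURES:
        return "recommended"
    if n_samples <= SUPPORTED_MAX_SAMPLES and n_features <= SUPPORTED_MAX_FEATURES:
        return "supported"
    return "out of stated range"


# Shape of the gesture dataset used in Step 6.
print(nicl_size_tier(9873, 32))
```

For the gesture dataset (9873 samples, 32 features) this reports the recommended tier.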

[Read the docs here.](https://docs.neuralk-ai.com/docs/intro)