unify organization of all example scripts under 1 repo #176

ndharasz · 2025-10-08T00:34:12Z

No description provided.

Copilot

Pull Request Overview

Adds a new example Jupyter notebook demonstrating downloading Numerai data, training a LightGBM model, and exporting a pickled predict function for upload.
Key changes:

Introduces example_model.ipynb with data download, model training, and serialization steps.
Includes environment setup cell installing dependencies.
Defines a predict function and exports it via cloudpickle.

Copilot · 2025-10-08T00:35:12Z

numerai/example_model.ipynb

+      ],
+      "source": [
+        "# Install dependencies\n",
+        "!pip install -q --upgrade numerapi pandas pyarrow matplotlib lightgbm scikit-learn scipy cloudpickle==3.1.1"


Unpinned bulk upgrading of multiple core libraries is causing the dependency conflicts shown in the cell output (pandas, scipy, scikit-learn, pyarrow). Recommend pinning compatible versions (or using a requirements.txt) and avoiding --upgrade unless necessary to ensure reproducible, conflict-free environments.
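One way to catch this kind of drift early is to compare the installed versions against the intended pins before running the rest of the notebook. The sketch below is only an illustration and is not part of the PR: `PINS` and `find_pin_mismatches` are hypothetical names, and the version numbers are copied from the suggestion on this comment without any check that they are mutually compatible.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical pin table; versions copied from the review suggestion,
# not verified to be mutually compatible.
PINS = {
    "numerapi": "0.7.9",
    "pandas": "2.2.2",
    "pyarrow": "14.0.2",
    "matplotlib": "3.7.1",
    "lightgbm": "4.3.0",
    "scikit-learn": "1.3.0",
    "scipy": "1.11.4",
    "cloudpickle": "3.1.1",
}


def find_pin_mismatches(pins):
    """Return {package: (wanted, installed)} for every package off its pin.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Report every package whose installed version differs from its pin.
    for name, (wanted, installed) in find_pin_mismatches(PINS).items():
        print(f"{name}: pinned {wanted}, found {installed}")
```

An empty result means the environment matches the pins exactly; anything else points at the package to reinstall.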

Suggested change

-        "!pip install -q --upgrade numerapi pandas pyarrow matplotlib lightgbm scikit-learn scipy cloudpickle==3.1.1"
+        "!pip install -q numerapi==0.7.9 pandas==2.2.2 pyarrow==14.0.2 matplotlib==3.7.1 lightgbm==4.3.0 scikit-learn==1.3.0 scipy==1.11.4 cloudpickle==3.1.1"

Copilot · 2025-10-08T00:35:12Z

numerai/example_model.ipynb

+        "napi.download_dataset(f\"{DATA_VERSION}/features.json\")\n",
+        "\n",
+        "# Load data\n",
+        "feature_metadata = json.load(open(f\"{DATA_VERSION}/features.json\"))\n",


File is opened without a context manager; this can leak file descriptors if an exception occurs. Prefer using: with open(f"{DATA_VERSION}/features.json") as f: feature_metadata = json.load(f).
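As a standalone illustration of the same pattern, the file read can be wrapped in a small helper so the handle is closed even when JSON parsing raises. The helper name `load_feature_metadata` is hypothetical and does not appear in the notebook.

```python
import json


def load_feature_metadata(path):
    """Read a JSON file and close it even if parsing raises."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

In the notebook this would presumably be called as `load_feature_metadata(f"{DATA_VERSION}/features.json")`.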

Suggested change

-        "feature_metadata = json.load(open(f\"{DATA_VERSION}/features.json\"))\n",
+        "with open(f\"{DATA_VERSION}/features.json\") as f:\n",
+        "    feature_metadata = json.load(f)\n",

Copilot · 2025-10-08T00:35:13Z

numerai/example_model.ipynb

+        "# Define predict function\n",
+        "def predict(\n",
+        "    live_features: pd.DataFrame,\n",
+        "    live_benchmark_models: pd.DataFrame\n",
+        " ) -> pd.DataFrame:\n",
+        "    live_predictions = model.predict(live_features[features])\n",
+        "    submission = pd.Series(live_predictions, index=live_features.index)\n",
+        "    return submission.to_frame(\"prediction\")\n",


The parameter live_benchmark_models is accepted but never used, which can confuse users of the example. Either remove it or document its intended future use.

Copilot · 2025-10-08T00:35:13Z

numerai/example_model.ipynb

+        "except:\n",
+        "    pass"


Bare except swallows all errors (including unexpected runtime issues) making debugging harder. Narrow this to except ImportError (or specific exceptions) and optionally log a brief message.
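A minimal sketch of the narrowed handler, assuming the guarded code is a Colab file download (the hunk shows only the `except` clause, so the `try` body here is a guess). `download_if_in_colab` is a hypothetical helper name; `google.colab.files.download` is the Colab call for sending a file to the browser. Only the import sits inside the `try`, so an error raised by the download itself still surfaces.

```python
def download_if_in_colab(path):
    """Trigger a browser download in Colab; do nothing elsewhere.

    Returns True if the download was requested, False if not in Colab.
    """
    try:
        # Only importable inside a Google Colab runtime.
        from google.colab import files
    except ImportError:
        print("google.colab not available; skipping file download.")
        return False
    # Outside the try block: failures here are real errors, not "not in Colab".
    files.download(path)
    return True
```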

Suggested change

-        "except:\n",
-        "    pass"
+        "except ImportError:\n",
+        "    print('google.colab not available; skipping file download.')"

Copilot · 2025-10-08T00:35:13Z

numerai/example_model.ipynb

+      "execution_count": 1,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "Ekw8Z93ljC3v",
+        "outputId": "bdd16698-2ad0-4423-b090-c5ce55fe3053"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Python 3.11.13\n"
+          ]
+        }
+      ],
+      "source": [
+        "!python --version"
+      ]
+    },
+    {
+      "cell_type": "code",


[nitpick] Kernel metadata specifies Python 3.10.12 while the executed environment reports Python 3.11.13, creating reproducibility ambiguity. Align the kernelspec version with the actual runtime or remove the explicit version display cell.

Suggested change

      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Ekw8Z93ljC3v",
        "outputId": "bdd16698-2ad0-4423-b090-c5ce55fe3053"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Python 3.11.13\n"
          ]
        }
      ],
      "source": [
        "!python --version"
      ]
    },
    {
      "cell_type": "code",

Copilot · 2025-10-08T00:35:13Z

numerai/example_model.ipynb

+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.10.12"


[nitpick] Kernel metadata specifies Python 3.10.12 while the executed environment reports Python 3.11.13, creating reproducibility ambiguity. Align the kernelspec version with the actual runtime or remove the explicit version display cell.

Suggested change

-      "version": "3.10.12"
+      "version": "3.11.13"

Copilot

Pull Request Overview

Copilot reviewed 1 out of 6 changed files in this pull request and generated 2 comments.

Copilot · 2025-10-08T00:42:40Z

numerai/example_model.ipynb

+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Python 3.11.13\n"


The runtime reports Python 3.11.13 while the notebook metadata declares 3.10.12; this mismatch can cause confusion or environment re-creation issues. Update the kernelspec/language_info version to match the actual runtime or remove the hard-coded version if managed externally.
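A mismatch like this can be detected mechanically. The sketch below assumes the declared version lives under `metadata.language_info.version` in the notebook JSON, which is where the diff hunk shows the `"version": "3.10.12"` entry; the function names are hypothetical.

```python
import platform
from typing import Optional


def declared_python_version(notebook: dict) -> Optional[str]:
    """Return the version string stored in the notebook metadata, if any."""
    return notebook.get("metadata", {}).get("language_info", {}).get("version")


def version_matches_runtime(notebook: dict) -> bool:
    """True when no version is declared or it equals the running interpreter's."""
    declared = declared_python_version(notebook)
    return declared is None or declared == platform.python_version()
```

The notebook dict would come from `json.load` on the `.ipynb` file. A notebook with no declared version is treated as matching, since there is nothing to contradict.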

Copilot · 2025-10-08T00:42:40Z

numerai/example_model.ipynb

+        "def predict(\n",
+        "    live_features: pd.DataFrame,\n",
+        "    live_benchmark_models: pd.DataFrame\n",
+        " ) -> pd.DataFrame:\n",
+        "    live_predictions = model.predict(live_features[features])\n",
+        "    submission = pd.Series(live_predictions, index=live_features.index)\n",
+        "    return submission.to_frame(\"prediction\")\n",


The parameter 'live_benchmark_models' is never used inside predict; if required only for interface compliance, prefix it with '_' (e.g. _live_benchmark_models) or add a brief comment to clarify intent. Otherwise remove it to reduce noise.

unify organization of all example scripts under 1 repo

694c841

Copilot AI review requested due to automatic review settings October 8, 2025 00:34

Copilot AI reviewed Oct 8, 2025

ndharasz added 2 commits October 7, 2025 17:41

update crypto example model to work correctly

ac6df0f

remove debug cells

55a6e4f

Copilot AI review requested due to automatic review settings October 8, 2025 00:41

Copilot AI reviewed Oct 8, 2025

made updates to crypto notebook for pkl'ing

4ab330d

ndharasz merged commit 2bd9844 into master Oct 8, 2025

ndharasz deleted the ndharasz/unified-example-scripts branch October 8, 2025 15:40
