@ndharasz (Contributor) commented Oct 8, 2025

No description provided.

Copilot AI review requested due to automatic review settings October 8, 2025 00:34

Copilot AI left a comment

Pull Request Overview

Adds an example Jupyter notebook that downloads Numerai data, trains a LightGBM model, and exports a pickled predict function for upload.
Key changes:

  • Introduces example_model.ipynb with data download, model training, and serialization steps.
  • Includes environment setup cell installing dependencies.
  • Defines a predict function and exports it via cloudpickle.

],
"source": [
"# Install dependencies\n",
"!pip install -q --upgrade numerapi pandas pyarrow matplotlib lightgbm scikit-learn scipy cloudpickle==3.1.1"

Copilot AI Oct 8, 2025

Upgrading several core libraries at once without version pins causes the dependency conflicts shown in the cell output (pandas, scipy, scikit-learn, pyarrow). Pin compatible versions (or use a requirements.txt) and avoid --upgrade unless necessary, so that the environment is reproducible and conflict-free.

Suggested change
- "!pip install -q --upgrade numerapi pandas pyarrow matplotlib lightgbm scikit-learn scipy cloudpickle==3.1.1"
+ "!pip install -q numerapi==0.7.9 pandas==2.2.2 pyarrow==14.0.2 matplotlib==3.7.1 lightgbm==4.3.0 scikit-learn==1.3.0 scipy==1.11.4 cloudpickle==3.1.1"

Copilot uses AI. Check for mistakes.
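
A minimal sketch of how the pins from the suggested change could be checked in a notebook cell after installation. It uses only the standard library's importlib.metadata. The PINS dict is copied from the suggestion, the helper name find_mismatches is hypothetical, and whether these versions are mutually compatible is not verified here.

```python
from importlib import metadata

# Pins copied from the suggested change; their mutual compatibility is an assumption.
PINS = {
    "numerapi": "0.7.9",
    "pandas": "2.2.2",
    "pyarrow": "14.0.2",
    "matplotlib": "3.7.1",
    "lightgbm": "4.3.0",
    "scikit-learn": "1.3.0",
    "scipy": "1.11.4",
    "cloudpickle": "3.1.1",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (pinned, installed_or_None)} for packages that are missing or differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling find_mismatches(PINS) right after the install cell returns an empty dict when every pinned version is the one actually installed.
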
"napi.download_dataset(f\"{DATA_VERSION}/features.json\")\n",
"\n",
"# Load data\n",
"feature_metadata = json.load(open(f\"{DATA_VERSION}/features.json\"))\n",

Copilot AI Oct 8, 2025

File is opened without a context manager; this can leak file descriptors if an exception occurs. Prefer using: with open(f"{DATA_VERSION}/features.json") as f: feature_metadata = json.load(f).

Suggested change
- "feature_metadata = json.load(open(f\"{DATA_VERSION}/features.json\"))\n",
+ "with open(f\"{DATA_VERSION}/features.json\") as f:\n",
+ " feature_metadata = json.load(f)\n",

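
The context-manager pattern from this comment can be tried in isolation with the sketch below. The helper name load_feature_metadata, the temporary file, and its contents are placeholders for illustration only; the notebook itself reads f"{DATA_VERSION}/features.json".

```python
import json
import tempfile
from pathlib import Path


def load_feature_metadata(path):
    """Read a JSON file; the with-block closes the handle even if parsing raises."""
    with open(path) as f:
        return json.load(f)


# Demo on a throwaway file with placeholder contents (not real Numerai metadata).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "features.json"
    demo_path.write_text(json.dumps({"example_key": ["feature_a", "feature_b"]}))
    feature_metadata = load_feature_metadata(demo_path)
```
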
Comment on lines +226 to +233
"# Define predict function\n",
"def predict(\n",
" live_features: pd.DataFrame,\n",
" live_benchmark_models: pd.DataFrame\n",
" ) -> pd.DataFrame:\n",
" live_predictions = model.predict(live_features[features])\n",
" submission = pd.Series(live_predictions, index=live_features.index)\n",
" return submission.to_frame(\"prediction\")\n",

Copilot AI Oct 8, 2025

The parameter live_benchmark_models is accepted but never used, which can confuse users of the example. Either remove it or document its intended future use.

Comment on lines +245 to +246
"except:\n",
" pass"

Copilot AI Oct 8, 2025

A bare except swallows every exception (including KeyboardInterrupt, SystemExit, and unexpected runtime errors), which makes debugging harder. Narrow it to except ImportError (or other specific exceptions) and optionally log a brief message.

Suggested change
- "except:\n",
- " pass"
+ "except ImportError:\n",
+ " print('google.colab not available; skipping file download.')"

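
A sketch of the narrowed handler, assuming the guarded block exists to offer a file download when the notebook runs in Colab (only the except clause is visible in this diff, so the try body is a guess). The function name download_if_colab is hypothetical; google.colab.files.download is the Colab helper that triggers a browser download.

```python
def download_if_colab(path: str) -> bool:
    """Trigger a browser download when running in Colab; return False elsewhere."""
    try:
        # Only the import is guarded: this is the one step expected to fail outside Colab.
        from google.colab import files
    except ImportError:
        print("google.colab not available; skipping file download.")
        return False
    # Errors raised by the download itself are deliberately not swallowed.
    files.download(path)
    return True
```

Keeping only the import inside the try block means a genuine failure during the download still surfaces as a traceback instead of being silently ignored.
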
Comment on lines +14 to +36
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Ekw8Z93ljC3v",
"outputId": "bdd16698-2ad0-4423-b090-c5ce55fe3053"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Python 3.11.13\n"
]
}
],
"source": [
"!python --version"
]
},
{
"cell_type": "code",

Copilot AI Oct 8, 2025

[nitpick] The notebook metadata (language_info version) specifies Python 3.10.12 while the executed environment reports Python 3.11.13, which makes the intended runtime ambiguous. Align the metadata version with the actual runtime or remove the explicit version display cell.

Suggested change
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Ekw8Z93ljC3v",
"outputId": "bdd16698-2ad0-4423-b090-c5ce55fe3053"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Python 3.11.13\n"
]
}
],
"source": [
"!python --version"
]
},
{
"cell_type": "code",

"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"

Copilot AI Oct 8, 2025

[nitpick] The notebook metadata (language_info version) specifies Python 3.10.12 while the executed environment reports Python 3.11.13, which makes the intended runtime ambiguous. Align the metadata version with the actual runtime or remove the explicit version display cell.

Suggested change
- "version": "3.10.12"
+ "version": "3.11.13"

Copilot AI review requested due to automatic review settings October 8, 2025 00:41

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 6 changed files in this pull request and generated 2 comments.


"output_type": "stream",
"name": "stdout",
"text": [
"Python 3.11.13\n"

Copilot AI Oct 8, 2025

The runtime reports Python 3.11.13 while the notebook metadata declares 3.10.12; this mismatch can cause confusion or environment re-creation issues. Update the kernelspec/language_info version to match the actual runtime or remove the hard-coded version if managed externally.
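
One way to catch this kind of mismatch is sketched below: read the declared version from the parsed notebook JSON and compare it with the interpreter that is actually running. The function names are hypothetical; the metadata path (metadata, then language_info, then version) follows the notebook file format.

```python
import platform


def declared_python_version(notebook: dict):
    """Return metadata.language_info.version from parsed notebook JSON, or None if absent."""
    return notebook.get("metadata", {}).get("language_info", {}).get("version")


def version_matches_runtime(notebook: dict, runtime_version=None) -> bool:
    """Compare the declared version with the running interpreter (or an explicit override)."""
    runtime = runtime_version or platform.python_version()
    return declared_python_version(notebook) == runtime


# Values taken from this review: the metadata declares 3.10.12, the runtime printed 3.11.13.
example_notebook = {"metadata": {"language_info": {"name": "python", "version": "3.10.12"}}}
mismatch_detected = not version_matches_runtime(example_notebook, "3.11.13")
```

For a real file, the notebook dict would come from json.load on the .ipynb file, for example example_model.ipynb from this pull request.
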

Comment on lines +227 to +233
"def predict(\n",
" live_features: pd.DataFrame,\n",
" live_benchmark_models: pd.DataFrame\n",
" ) -> pd.DataFrame:\n",
" live_predictions = model.predict(live_features[features])\n",
" submission = pd.Series(live_predictions, index=live_features.index)\n",
" return submission.to_frame(\"prediction\")\n",

Copilot AI Oct 8, 2025

The parameter 'live_benchmark_models' is never used inside predict; if required only for interface compliance, prefix it with '_' (e.g. _live_benchmark_models) or add a brief comment to clarify intent. Otherwise remove it to reduce noise.
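
A sketch of the underscore-prefix option, assuming the caller passes both arguments positionally (if the upload pipeline calls predict with keyword arguments, renaming the parameter would break it, so a comment on the original name is the safer choice). The features list and MeanModel class are hypothetical stand-ins for the notebook's feature set and trained LightGBM model.

```python
import pandas as pd

# Hypothetical stand-ins for the notebook's feature list and trained model.
features = ["feature_a", "feature_b"]


class MeanModel:
    """Toy model: predicts the row-wise mean of the selected features."""

    def predict(self, X: pd.DataFrame):
        return X.mean(axis=1).to_numpy()


model = MeanModel()


def predict(
    live_features: pd.DataFrame,
    _live_benchmark_models: pd.DataFrame,  # unused; kept only to match the two-argument interface
) -> pd.DataFrame:
    live_predictions = model.predict(live_features[features])
    submission = pd.Series(live_predictions, index=live_features.index)
    return submission.to_frame("prediction")
```
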

@ndharasz ndharasz merged commit 2bd9844 into master Oct 8, 2025
@ndharasz ndharasz deleted the ndharasz/unified-example-scripts branch October 8, 2025 15:40
3 participants