Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CI #60

Merged
merged 14 commits into from
Jul 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
109 changes: 0 additions & 109 deletions .github/workflows/gpu-ci.yml

This file was deleted.

102 changes: 102 additions & 0 deletions .github/workflows/pr.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,102 @@
name: GPU CI

on:
push:
branches:
- main
- "pull-request/[0-9]+"
tags:
- "v[0-9]+.[0-9]+.[0-9]+"

jobs:
pr-builder:
needs:
- python-gpu-tests
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/pr-builder.yaml@branch-24.06

python-gpu-tests:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.06
with:
build_type: pull-request
node_type: "gpu-v100-latest-1"
arch: "amd64"
container_image: "rapidsai/base:24.06-cuda12.2-py3.11"
run_script: "ci/test_gpu.sh"

# benchmark:
# runs-on: linux-amd64-gpu-p100-latest-1
# container:
# image: rapidsai/base:24.06-cuda12.2-py3.11
# env:
# NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
# options: --shm-size=1G -ulimit memlock=-1 --ulimit stack=67108864

# steps:
# - uses: actions/checkout@v3
# with:
# fetch-depth: 50 # this is to make sure we obtain the target base commit

# - name: Setup Environment
# shell: bash
# run: |
# git config --global --add safe.directory /__w/crossfit/crossfit

# echo "BASE_SHA=$(git ls-remote -q | grep refs/heads/main$ | awk '{print $1}' | xargs git rev-parse --short)" >> ${GITHUB_ENV}
# echo "PR_COMMENT=$(mktemp)" >> ${GITHUB_ENV}

# - name: Setup Environment (PR)
# if: ${{ github.event_name == 'pull_request' }}
# shell: bash
# run: |
# echo "HEAD_SHA=$(echo ${{ github.event.pull_request.head.sha }} | cut cut -c1-8)" >> ${GITHUB_ENV}

# - name: Setup Environment (Push)
# if: ${{ github.event_name == 'push' }}
# shell: bash
# run: |
# echo "HEAD_SHA=$(echo ${GITHUB_SHA} | cut -c1-8)" >> ${GITHUB_ENV}

# - name: Run benchmarks
# shell: bash
# run: |
# RUN_BENCHMARK="py.test -m benchmark tests/"

# git checkout ${{ env.BASE_SHA }}
# $RUN_BENCHMARK --benchmark-save=main

# git checkout ${{ env.HEAD_SHA }}
# $RUN_BENCHMARK --benchmark-save=${{ env.HEAD_SHA }}

# - name: Compare results
# run: |

# py.test-benchmark compare > cmp_results

# echo 'Benchmark comparison for [`${{ env.BASE_SHA }}`](${{ github.event.repository.html_url }}/commit/${{ env.BASE_SHA }}) (main) vs [`${{ env.HEAD_SHA }}`](${{ github.event.repository.html_url }}/commit/${{ env.HEAD_SHA }}) (PR)' >> pr_comment
# echo '```' >> pr_comment
# cat cmp_results >> pr_comment
# echo '```' >> pr_comment
# cat pr_comment > ${{ env.PR_COMMENT }}

# - name: 'Comment PR'
# if: github.ref != 'refs/heads/main'
# uses: actions/github-script@v6
# with:
# github-token: ${{ secrets.GITHUB_TOKEN }}
# script: |
# const pullRequests = await github.rest.pulls.list({
# owner: context.repo.owner,
# repo: context.repo.repo,
# state: 'open',
# })

# const filtered = pullRequests.data.filter(x => x.head.sha.startsWith('${{ env.HEAD_SHA }}'))

# github.rest.issues.createComment({
# issue_number: filtered[0].number,
# owner: context.repo.owner,
# repo: context.repo.repo,
# body: require('fs').readFileSync('${{ env.PR_COMMENT }}').toString()
# })
45 changes: 45 additions & 0 deletions ci/test_gpu.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
#!/bin/bash
# Enabling strict error handling
set -Eeuo pipefail

echo "Checking CUDA version in the conda environment..."

# Extract CUDA version from conda list output
CUDA_VERSION=$(conda list | grep 'cuda-version' | awk '{print $2}')

# Check if CUDA version was found
if [ -z "$CUDA_VERSION" ]; then
echo "CUDA version not found in the conda environment."
exit 1 # Exit with a non-zero status indicating failure
else
echo "CUDA version found: $CUDA_VERSION"
fi

echo "Installing pytorch,transformers and pytest to the environment for crossfit tests..."
mamba install \
cuda-version=$CUDA_VERSION \
conda-forge::pytorch \
conda-forge::transformers \
conda-forge::pytest \
-c conda-forge \
--override-channels \
--yes

# Have to install sentence-transformers from pip
# because conda-forge leads to a torch vision conflict
# which leads to it being installed on CPUs
pip3 install sentence-transformers sentencepiece

# Install the crossfit package in editable mode with test dependencies
pip3 install -e '.[test]'
# Running tests
echo "Running tests..."
pytest tests
# Capture the exit code of pytest
EXITCODE=$?

# Echo the exit code
echo "Crossfit test script exiting with value: ${EXITCODE}"

# Exit with the same code as pytest
exit ${EXITCODE}
2 changes: 1 addition & 1 deletion crossfit/backend/cudf/array.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ def __init__(self):
def concatenate(self, series_list, *, axis=None):
return cudf.concat(series_list, axis=axis or 0)

np_backend_dispatch.register((cudf.Series, cudf.GenericIndex))(CudfBackend())
np_backend_dispatch.register((cudf.Series, cudf.Index))(CudfBackend())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if this is the right fix. CC: @rjzamora

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks correct to me. My understanding is that all "real" index objects inherit from Index now.

I suppose dask-cudf uses BaseIndex, but that dispatch was added a long time ago (and BaseIndex is probably unnecessary).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton for confirming this .



@conversion.dispatch_to_dlpack.register_lazy("cudf")
Expand Down
2 changes: 1 addition & 1 deletion crossfit/backend/cudf/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ def create_list_series_from_1d_or_2d_ar(ar, index):
return RuntimeError(f"Unexpected input shape: {ar.shape}")
data = as_column(ar.flatten())
offset_col = as_column(cp.arange(start=0, stop=len(data) + 1, step=n_cols), dtype="int32")
mask_col = cp.full(shape=n_rows, fill_value=True)
mask_col = cp.full(shape=n_rows, fill_value=cp.bool_(True))
mask = cudf._lib.transform.bools_to_mask(as_column(mask_col))
lc = cudf.core.column.ListColumn(
size=n_rows,
Expand Down
2 changes: 1 addition & 1 deletion crossfit/backend/torch/hf/model.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ def fit_memory_estimate_curve(self, model=None):
}

try:
_ = model(batch)
_ = model(**batch)
memory_used = torch.cuda.max_memory_allocated() / (1024**2) # Convert to MB
X.append([batch_size, seq_len, seq_len**2])
y.append(memory_used)
Expand Down
2 changes: 1 addition & 1 deletion crossfit/op/combinators.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ def __init__(self, *ops, pre=None, cols=False, repartition=None, keep_cols=None)
def call_dask(self, data):
for op in self.ops:
if self.repartition is not None:
data = data.repartition(self.repartition)
data = data.repartition(npartitions=self.repartition)

data = op(data)

Expand Down
2 changes: 1 addition & 1 deletion crossfit/op/vector_search.py
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ def call_dask(self, queries, items, partition_num=10_000):
partitions = max(int(len(items) / partition_num), 1)
if not partitions % 2 == 0:
partitions += 1
_items = items.repartition(partitions)
_items = items.repartition(npartitions=partitions)

delayed_cross_products = []
for i in range(queries.npartitions):
Expand Down
26 changes: 20 additions & 6 deletions setup.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,24 @@
# Copyright 2024 NVIDIA CORPORATION
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import codecs
import itertools
import os

from setuptools import find_packages, setup

VERSION = "0.0.1"
VERSION = "0.0.2"


def get_long_description():
Expand Down Expand Up @@ -40,15 +54,15 @@ def read_requirements(filename):

setup(
name="crossfit",
description="Metric calculation library",
description="Offline inference and metric calculation library",
long_description=get_long_description(),
long_description_content_type="text/markdown",
author="NVIDIA Corporation",
url="https://github.com/NVIDIA-Merlin/crossfit",
url="https://github.com/rapidsai/crossfit/",
project_urls={
"Issues": "https://github.com/NVIDIA-Merlin/crossfit/issues",
"CI": "https://github.com/NVIDIA-Merlin/crossfit/actions",
"Changelog": "https://github.com/NVIDIA-Merlin/crossfit/releases",
"Issues": "https://github.com/rapidsai/crossfit/issues",
"CI": "https://github.com/rapidsai/crossfit/actions/",
"Changelog": "https://github.com/rapidsai/crossfit/releases",
},
license="Apache License, Version 2.0",
version=VERSION,
Expand Down
1 change: 1 addition & 0 deletions tests/examples/test_scripts.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
examples_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", "..", "examples")


@pytest.mark.skip(reason="This test is taking too long")
@pytest.mark.singlegpu
def test_beir_report():
path = os.path.join(examples_dir, "beir_report.py")
Expand Down
1 change: 1 addition & 0 deletions tests/metrics/ranking/test_ndcg.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
import pytest

pytest.importorskip("cupy")
pytest.importorskip("pytrec_eval")

import numpy as np # noqa: E402
from pytrec_eval import RelevanceEvaluator # noqa: E402
Expand Down
2 changes: 2 additions & 0 deletions tests/metrics/ranking/test_precision.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
import pytest

pytest.importorskip("cupy")
pytest.importorskip("pytrec_eval")


import numpy as np # noqa: E402
from pytrec_eval import RelevanceEvaluator # noqa: E402
Expand Down
2 changes: 2 additions & 0 deletions tests/metrics/ranking/test_recall.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
import pytest

pytest.importorskip("cupy")
pytest.importorskip("pytrec_eval")


import numpy as np # noqa: E402
from pytrec_eval import RelevanceEvaluator # noqa: E402
Expand Down
1 change: 1 addition & 0 deletions tests/report/beir/test_embed.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
import pytest

cp = pytest.importorskip("cupy")
sentece_transformers = pytest.importorskip("sentence_transformers")

import crossfit as cf # noqa: E402

Expand Down
Loading
Loading