Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Handle ptx file paths during strings_udf import #11862

Merged
merged 10 commits into from
Oct 5, 2022

Conversation

galipremsagar
Copy link
Contributor

@galipremsagar galipremsagar commented Oct 4, 2022

Description

strings_udf is currently throwing an error on import when source compiled with only one gpu arch:

----> 1 import cudf

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/__init__.py:20
     14 from cudf.api.extensions import (
     15     register_dataframe_accessor,
     16     register_index_accessor,
     17     register_series_accessor,
     18 )
     19 from cudf.api.types import dtype
---> 20 from cudf.core.algorithms import factorize
     21 from cudf.core.cut import cut
     22 from cudf.core.dataframe import DataFrame, from_dataframe, from_pandas, merge

.
.
.
File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/indexed_frame.py:60
     58 from cudf.core.multiindex import MultiIndex
     59 from cudf.core.resample import _Resampler
---> 60 from cudf.core.udf.utils import (
     61     _compile_or_get,
     62     _get_input_args_from_frame,
     63     _post_process_output_col,
     64     _return_arr_from_dtype,
     65 )
     66 from cudf.utils import docutils
     67 from cudf.utils.utils import _cudf_nvtx_annotate

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/udf/__init__.py:27
     25 _STRING_UDFS_ENABLED = False
     26 try:
---> 27     import strings_udf
     29     if strings_udf.ENABLED:
     30         from . import strings_typing  # isort: skip

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/strings_udf/__init__.py:66
     62 import pdb;pdb.set_trace()
     63 sms = [
     64     os.path.basename(f).rstrip(".ptx").lstrip("shim_") for f in files
     65 ]
---> 66 selected_sm = max(sm for sm in sms if sm < cc)
     67 ptxpath = os.path.join(
     68     os.path.dirname(__file__), f"shim_{selected_sm}.ptx"
     69 )
     71 if driver_version >= compiler_from_ptx_file(ptxpath):

ValueError: max() arg is an empty sequence

This PR fixes the above issue and also handles shim_sm-real.ptx and shim_sm.ptx files appropriately.

Checklist

@galipremsagar galipremsagar added bug Something isn't working 3 - Ready for Review Ready for review by team non-breaking Non-breaking change labels Oct 4, 2022
@galipremsagar galipremsagar self-assigned this Oct 4, 2022
@galipremsagar galipremsagar requested a review from a team as a code owner October 4, 2022 23:26
@galipremsagar galipremsagar added this to PR-WIP in v22.10 Release via automation Oct 4, 2022
@galipremsagar galipremsagar requested review from wence- and shwina and removed request for a team October 4, 2022 23:26
@github-actions github-actions bot added the Python Affects Python cuDF API. label Oct 4, 2022
@galipremsagar galipremsagar moved this from PR-WIP to PR-Needs review in v22.10 Release Oct 4, 2022
python/strings_udf/strings_udf/__init__.py Outdated Show resolved Hide resolved
python/strings_udf/strings_udf/__init__.py Outdated Show resolved Hide resolved
python/strings_udf/strings_udf/__init__.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 5, 2022

Codecov Report

Base: 87.52% // Head: 87.51% // Decreases project coverage by -0.01% ⚠️

Coverage data is based on head (d321b0e) compared to base (dfd3d89).
Patch coverage: 75.00% of modified lines in pull request are covered.

❗ Current head d321b0e differs from pull request most recent head 3011d96. Consider uploading reports for the commit 3011d96 to get more accurate results

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.10   #11862      +/-   ##
================================================
- Coverage         87.52%   87.51%   -0.02%     
================================================
  Files               133      133              
  Lines             21807    21826      +19     
================================================
+ Hits              19087    19100      +13     
- Misses             2720     2726       +6     
Impacted Files Coverage Δ
python/strings_udf/strings_udf/__init__.py 86.27% <75.00%> (-10.61%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

native_sms = []
for f in files:
file_name = os.path.basename(f)
sm_number = int(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can't be a number. SM's aren't all numeric values any more

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear the dev.compute_capability value will always be an integer. ( 75, 80, 90, ... ). But going forward we will see code gen with values like 90a which means they hold sm90 only instructions that can't be used by newer archs

for f in files:
file_name = os.path.basename(f)
sm_number = int(
file_name.rstrip(".ptx").lstrip("shim_").rstrip("-real")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to strip -real and -virtual

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be best to update the CMake side to transform CMAKE_CUDA_ARCHITECTURES to be a unique set of architectures without any -real and -virtual postfixes so the python loading side has a 1 to 1 mapping.

python/strings_udf/strings_udf/__init__.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added the CMake CMake build issue label Oct 5, 2022
Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks amazing, Thanks!

v22.10 Release automation moved this from PR-Needs review to PR-Reviewer approved Oct 5, 2022
@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Oct 5, 2022
@galipremsagar galipremsagar changed the title [REVIEW] Handle ptx file paths [REVIEW] Handle ptx file paths during strings_udf import Oct 5, 2022
@galipremsagar
Copy link
Contributor Author

rerun tests

Co-authored-by: robertmaynard <robertmaynard@users.noreply.github.com>
@jolorunyomi jolorunyomi merged commit 59a3152 into rapidsai:branch-22.10 Oct 5, 2022
v22.10 Release automation moved this from PR-Reviewer approved to Done Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge bug Something isn't working CMake CMake build issue non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants