Generalize the fix for Pandas extension dtypes #58

Merged
albert17 merged 1 commit into NVIDIA-Merlin:main from karlhigley:fix/string-dtype on Apr 4, 2022

@karlhigley
Contributor

This update checks the dtype's `kind` property, which returns `O` by default for Pandas extension dtypes, so the previous fix now applies to additional extension dtypes.
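
For readers unfamiliar with `kind`, here is a minimal sketch of the idea, not the code changed in this PR. The helper name `to_numpy_dtype` is hypothetical, and the fallback to `numpy_dtype` for masked numeric dtypes is an assumption added only so the example handles more than the object case.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


def to_numpy_dtype(dtype):
    """Hypothetical helper: map a dtype to a NumPy dtype using `kind`."""
    if isinstance(dtype, ExtensionDtype):
        # ExtensionDtype.kind is a one-character code; the base class
        # returns "O", so this branch covers any extension dtype that
        # does not override it (not just pd.StringDtype).
        if dtype.kind == "O":
            return np.dtype("O")
        # Assumption: masked numeric dtypes such as pd.Int64Dtype expose
        # a NumPy equivalent through `numpy_dtype`.
        numpy_dtype = getattr(dtype, "numpy_dtype", None)
        if numpy_dtype is not None:
            return np.dtype(numpy_dtype)
        raise TypeError(f"No NumPy equivalent known for {dtype!r}")
    # Plain NumPy dtypes and dtype strings pass straight through.
    return np.dtype(dtype)


string_result = to_numpy_dtype(pd.StringDtype())
categorical_result = to_numpy_dtype(pd.CategoricalDtype())
```

Checking `kind` instead of testing `isinstance(dtype, pd.StringDtype)` means one branch handles every extension dtype that reports `O`, which is the generalization the description refers to.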
@karlhigley karlhigley added this to the Merlin 22.04 milestone Apr 2, 2022
@karlhigley karlhigley requested a review from albert17 April 2, 2022 18:19
@karlhigley karlhigley self-assigned this Apr 2, 2022
@nvidia-merlin-bot

CI Results
GitHub pull request #58 of commit 3a700dab808aedddbe602c209ee58851e15574ad, no merge conflicts.
Running as SYSTEM
Setting status of 3a700dab808aedddbe602c209ee58851e15574ad to PENDING with url https://10.20.13.93:8080/job/merlin_core/8/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_core
using credential ce87ff3c-94f0-400a-8303-cb4acb4918b5
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/core # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/core
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems username and pass
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/core +refs/pull/58/*:refs/remotes/origin/pr/58/* # timeout=10
 > git rev-parse 3a700dab808aedddbe602c209ee58851e15574ad^{commit} # timeout=10
Checking out Revision 3a700dab808aedddbe602c209ee58851e15574ad (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3a700dab808aedddbe602c209ee58851e15574ad # timeout=10
Commit message: "Generalize the fix for Pandas extension dtypes"
 > git rev-list --no-walk 78ddc20664373c5283c8efa64fbb9aca68d2c82f # timeout=10
[merlin_core] $ /bin/bash /tmp/jenkins5277764248889785850.sh
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: setuptools in /usr/local/lib/python3.8/dist-packages (61.3.1)
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.1, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_core/core, configfile: pyproject.toml
plugins: xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 337 items / 1 skipped

tests/unit/core/test_dispatch.py .. [ 0%]
tests/unit/dag/test_base_operator.py .... [ 1%]
tests/unit/dag/test_column_selector.py .......................... [ 9%]
tests/unit/dag/test_tags.py ...... [ 11%]
tests/unit/dag/ops/test_selection.py ... [ 12%]
tests/unit/io/test_io.py ............................................... [ 26%]
................................................................ [ 45%]
tests/unit/schema/test_column_schemas.py ............................... [ 54%]
........................................................................ [ 75%]
........................................................................ [ 97%]
[ 97%]
tests/unit/schema/test_schema_io.py .. [ 97%]
tests/unit/utils/test_utils.py ........ [100%]

=============================== warnings summary ===============================
tests/unit/dag/test_base_operator.py: 4 warnings
tests/unit/io/test_io.py: 73 warnings
/usr/lib/python3.8/site-packages/cudf/core/dataframe.py:1253: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/io/test_io.py: 145 warnings
/var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg/partd/pandas.py:113: DeprecationWarning: The Index._get_attributes_dict method is deprecated, and will be removed in a future version
header = (type(ind), ind._get_attributes_dict(), values.dtype, cat)

tests/unit/io/test_io.py::test_validate_and_regenerate_dataset
/var/jenkins_home/workspace/merlin_core/core/merlin/io/parquet.py:535: DeprecationWarning: 'ParquetDataset.pieces' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version. Specify 'use_legacy_dataset=False' while constructing the ParquetDataset, and then use the '.fragments' attribute instead.
paths = [p.path for p in pa_dataset.pieces]

tests/unit/utils/test_utils.py::test_nvt_distributed[True-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/distributed/node.py:160: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 45047 instead
warnings.warn(

tests/unit/utils/test_utils.py::test_nvt_distributed[True-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/distributed/node.py:160: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 38331 instead
warnings.warn(

tests/unit/utils/test_utils.py::test_nvt_distributed[False-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/distributed/node.py:160: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44465 instead
warnings.warn(

tests/unit/utils/test_utils.py::test_nvt_distributed[False-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/distributed/node.py:160: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 45833 instead
warnings.warn(

tests/unit/utils/test_utils.py::test_nvt_distributed_force[True]
/var/jenkins_home/.local/lib/python3.8/site-packages/distributed/node.py:160: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 38435 instead
warnings.warn(

tests/unit/utils/test_utils.py::test_nvt_distributed_force[False]
/var/jenkins_home/.local/lib/python3.8/site-packages/distributed/node.py:160: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36791 instead
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================ 337 passed, 1 skipped, 229 warnings in 52.88s =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/core/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_core] $ /bin/bash /tmp/jenkins1661941692584412441.sh

@github-actions

github-actions bot commented Apr 2, 2022

Documentation preview

https://nvidia-merlin.github.io/core/review/pr-58

@albert17 albert17 merged commit 047a38a into NVIDIA-Merlin:main Apr 4, 2022