Adding ops for feature column functionality and feature column to workflow mapping function #379

Merged
merged 33 commits on Nov 3, 2020

Conversation

alecgunny
Contributor

Increases NVTabular compatibility with the TensorFlow feature column API by adding the remaining necessary ops (cross and bucketize) and a function that maps a set of feature columns to an NVTabular workflow performing all of the analogous preprocessing. Addresses #371

HashedCross doesn't support multi-hot yet, and I'm not sure that extending it to multi-hots will necessarily be easy. For reference, the TF cross op handles multi-hots by taking the cartesian product of the indices of each feature. See the documentation here.
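As a rough illustration of that cartesian-product behavior (plain Python, not the actual TF implementation):

```python
# Illustration only: TF's crossed column pairs every index of one multi-hot
# feature with every index of the other, then hashes each crossed pair.
from itertools import product

feature_a = [3, 7]       # two category indices for a single example
feature_b = [1, 4, 9]    # three category indices for the same example

crossed_pairs = list(product(feature_a, feature_b))
# -> [(3, 1), (3, 4), (3, 9), (7, 1), (7, 4), (7, 9)]
# 6 crossed values for this example, each then hashed into the cross's bucket space.
```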

I still need to add bucketize support and test everything.
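For context, a minimal usage sketch of the mapping function; the import path matches the module added in this PR, but the exact signature and return values are assumptions, not the confirmed API:

```python
# Hypothetical usage sketch: the exact signature and return values of
# make_feature_column_workflow are assumptions, not the confirmed API.
import tensorflow as tf

from nvtabular.framework_utils.tensorflow import make_feature_column_workflow

# TF feature columns covering the preprocessing this PR targets.
age = tf.feature_column.numeric_column("age")
age_buckets = tf.feature_column.bucketized_column(age, boundaries=[18, 30, 45, 60])
occupation = tf.feature_column.categorical_column_with_hash_bucket(
    "occupation", hash_bucket_size=1000
)
age_x_occupation = tf.feature_column.crossed_column(
    [age_buckets, occupation], hash_bucket_size=10000
)

# Map the feature columns to an NVTabular workflow that performs the
# analogous preprocessing (bucketizing, hashing, crossing).
workflow, feature_columns = make_feature_column_workflow(
    [age_buckets, occupation, age_x_occupation], label_name="label"
)
```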

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit bd068b8168424c4a151775173c529db7c07c6720, no merge conflicts.
Running as SYSTEM
Setting status of bd068b8168424c4a151775173c529db7c07c6720 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1005/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse bd068b8168424c4a151775173c529db7c07c6720^{commit} # timeout=10
Checking out Revision bd068b8168424c4a151775173c529db7c07c6720 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f bd068b8168424c4a151775173c529db7c07c6720 # timeout=10
Commit message: "adding ops and feature column utils"
 > git rev-list --no-walk 171491a2233ecaa82788cf026f779e0c39e8b87a # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7714710515846326646.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins2911655224900517134.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 951b025ce20f4a29d0949d6223a9f93cee0dc820, no merge conflicts.
Running as SYSTEM
Setting status of 951b025ce20f4a29d0949d6223a9f93cee0dc820 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1006/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 951b025ce20f4a29d0949d6223a9f93cee0dc820^{commit} # timeout=10
Checking out Revision 951b025ce20f4a29d0949d6223a9f93cee0dc820 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 951b025ce20f4a29d0949d6223a9f93cee0dc820 # timeout=10
Commit message: "importing hashed cross in ops"
 > git rev-list --no-walk bd068b8168424c4a151775173c529db7c07c6720 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins3759065021120467545.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins61563149220576116.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5, no merge conflicts.
Running as SYSTEM
Setting status of 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1007/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5^{commit} # timeout=10
Checking out Revision 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5 # timeout=10
Commit message: "switching to xor"
 > git rev-list --no-walk 951b025ce20f4a29d0949d6223a9f93cee0dc820 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6439279322316503336.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins3780925671695674544.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 319d475f2526c3d95968e1d09476b036d2d3e0d1, no merge conflicts.
Running as SYSTEM
Setting status of 319d475f2526c3d95968e1d09476b036d2d3e0d1 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1009/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 319d475f2526c3d95968e1d09476b036d2d3e0d1^{commit} # timeout=10
Checking out Revision 319d475f2526c3d95968e1d09476b036d2d3e0d1 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 319d475f2526c3d95968e1d09476b036d2d3e0d1 # timeout=10
Commit message: "hashed cross and workflow builder working"
 > git rev-list --no-walk f5a6ddd36454d7f0c19634070c801af0597e3b9f # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins2894470993341789209.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins8960642179791915675.sh

@alecgunny
Contributor Author

@benfred what's our stance on adding ops without list support? If we're ok with it, should we add a support matrix in the documentation?

@alecgunny
Contributor Author

With the addition of bucketize, we should have full TF feature column coverage (minus the sequence columns, which I won't worry about for now). The shared embeddings and weighted shared embeddings are more Keras layers than preprocessing steps, so we'll still need to add layers that cover those. But overall this should put us in pretty good shape. We just need to build tests, docs, and an example, and we should be good to go.
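To summarize that coverage (an illustrative summary only, not the converter's actual lookup table; the Categorify pairing is assumed from the op files touched in this PR):

```python
# Illustrative coverage summary; entries marked "assumed" are not confirmed by this PR.
TF_FEATURE_COLUMN_COVERAGE = {
    "crossed_column": "nvtabular.ops.HashedCross",        # added in this PR
    "bucketized_column": "nvtabular.ops.Bucketize",       # added in this PR
    "vocabulary-based categorical columns": "nvtabular.ops.Categorify",  # assumed
    "shared / weighted shared embeddings": "Keras layers (follow-up work)",
    "sequence columns": "out of scope for now",
}
```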

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed, no merge conflicts.
Running as SYSTEM
Setting status of 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1011/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed^{commit} # timeout=10
Checking out Revision 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed # timeout=10
Commit message: "adding bucketize"
 > git rev-list --no-walk 010157a6e70d28c90c508e0b3430fc2a76a6cd14 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins6823510643244802508.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
2 files would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins8538873942078219243.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit d2f1efd8ae32c71cca4daa954b66a56f3b6ca126, no merge conflicts.
Running as SYSTEM
Setting status of d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1012/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse d2f1efd8ae32c71cca4daa954b66a56f3b6ca126^{commit} # timeout=10
Checking out Revision d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 # timeout=10
Commit message: "adding op tests"
 > git rev-list --no-walk 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins9117988078583547627.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
3 files would be reformatted, 72 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins3583608757503062890.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8, no merge conflicts.
Running as SYSTEM
Setting status of 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1013/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8^{commit} # timeout=10
Checking out Revision 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 # timeout=10
Commit message: "blackening"
 > git rev-list --no-walk d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6095631252068171384.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_io.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/categorify.py
Oh no! 💥 💔 💥
4 files would be reformatted, 71 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins156390319430102341.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit ebf234766dc5efb29c5d858f27eff061ca267c7f, no merge conflicts.
Running as SYSTEM
Setting status of ebf234766dc5efb29c5d858f27eff061ca267c7f to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1014/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse ebf234766dc5efb29c5d858f27eff061ca267c7f^{commit} # timeout=10
Checking out Revision ebf234766dc5efb29c5d858f27eff061ca267c7f (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f ebf234766dc5efb29c5d858f27eff061ca267c7f # timeout=10
Commit message: "blackening"
 > git rev-list --no-walk 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7642891004762184407.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/benchmarks/test_notebooks.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_io.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/categorify.py
Oh no! 💥 💔 💥
5 files would be reformatted, 70 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1940264879864262767.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 66419f4267583bbdbb4302757fd016c1a88efd94, no merge conflicts.
Running as SYSTEM
Setting status of 66419f4267583bbdbb4302757fd016c1a88efd94 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1015/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 66419f4267583bbdbb4302757fd016c1a88efd94^{commit} # timeout=10
Checking out Revision 66419f4267583bbdbb4302757fd016c1a88efd94 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 66419f4267583bbdbb4302757fd016c1a88efd94 # timeout=10
Commit message: "documenting"
 > git rev-list --no-walk ebf234766dc5efb29c5d858f27eff061ca267c7f # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins5972829207961115146.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/benchmarks/test_notebooks.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_io.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/categorify.py
Oh no! 💥 💔 💥
5 files would be reformatted, 70 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins4954172438073774049.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 1446a6407a3c1d468eba5430b919e13c23f49771, no merge conflicts.
Running as SYSTEM
Setting status of 1446a6407a3c1d468eba5430b919e13c23f49771 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1016/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 1446a6407a3c1d468eba5430b919e13c23f49771^{commit} # timeout=10
Checking out Revision 1446a6407a3c1d468eba5430b919e13c23f49771 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 1446a6407a3c1d468eba5430b919e13c23f49771 # timeout=10
Commit message: "updated blackening"
 > git rev-list --no-walk 66419f4267583bbdbb4302757fd016c1a88efd94 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7595550585448599460.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/framework_utils/tensorflow/feature_column_utils.py:3:1: F401 'yaml' imported but unused
./nvtabular/framework_utils/tensorflow/__init__.py:17:1: F401 '.feature_column_utils.make_feature_column_workflow' imported but unused
./nvtabular/ops/hashed_cross.py:17:1: F401 'cudf.utils.dtypes.is_list_dtype' imported but unused
./nvtabular/ops/hashed_cross.py:20:1: F401 '.categorify._encode_list_column' imported but unused
./nvtabular/ops/bucketize.py:18:1: F401 'cudf.utils.dtypes.is_list_dtype' imported but unused
./nvtabular/ops/bucketize.py:21:1: F401 '.categorify._encode_list_column' imported but unused
./nvtabular/ops/bucketize.py:29:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7414266527318070797.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 8b711512cfca966e3b5dfb6c7b4560aa353d97e8, no merge conflicts.
Running as SYSTEM
Setting status of 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1017/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 8b711512cfca966e3b5dfb6c7b4560aa353d97e8^{commit} # timeout=10
Checking out Revision 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 # timeout=10
Commit message: "fixing formatting issues"
 > git rev-list --no-walk 1446a6407a3c1d468eba5430b919e13c23f49771 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2182059037975933060.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/framework_utils/tensorflow/feature_column_utils.py:3:1: F401 'yaml' imported but unused
./nvtabular/ops/bucketize.py:18:1: F401 'cudf.utils.dtypes.is_list_dtype' imported but unused
./nvtabular/ops/bucketize.py:21:1: F401 '.categorify._encode_list_column' imported but unused
./nvtabular/ops/bucketize.py:29:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins4688283507400897510.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 4afd77285818b2e75637fcbc59024793d41e311b, no merge conflicts.
Running as SYSTEM
Setting status of 4afd77285818b2e75637fcbc59024793d41e311b to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1018/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 4afd77285818b2e75637fcbc59024793d41e311b^{commit} # timeout=10
Checking out Revision 4afd77285818b2e75637fcbc59024793d41e311b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4afd77285818b2e75637fcbc59024793d41e311b # timeout=10
Commit message: "fixing formatting issues"
 > git rev-list --no-walk 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins8789470020834365298.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/framework_utils/tensorflow/feature_column_utils.py:3:1: F401 'yaml' imported but unused
./nvtabular/ops/bucketize.py:30:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7235389498370303335.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 5a22fb406220010733c1bd3221d2542ad483a0bd, no merge conflicts.
Running as SYSTEM
Setting status of 5a22fb406220010733c1bd3221d2542ad483a0bd to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1019/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 5a22fb406220010733c1bd3221d2542ad483a0bd^{commit} # timeout=10
Checking out Revision 5a22fb406220010733c1bd3221d2542ad483a0bd (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5a22fb406220010733c1bd3221d2542ad483a0bd # timeout=10
Commit message: "fixing formatting issues"
 > git rev-list --no-walk 4afd77285818b2e75637fcbc59024793d41e311b # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins4031815495685499956.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/ops/bucketize.py:30:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins3387745480078041911.sh

@alecgunny
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 5c39ed49cd67ae3d969da6c4be3889d3f871f6de, no merge conflicts.
Running as SYSTEM
Setting status of 5c39ed49cd67ae3d969da6c4be3889d3f871f6de to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1033/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 5c39ed49cd67ae3d969da6c4be3889d3f871f6de^{commit} # timeout=10
Checking out Revision 5c39ed49cd67ae3d969da6c4be3889d3f871f6de (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5c39ed49cd67ae3d969da6c4be3889d3f871f6de # timeout=10
Commit message: "fixing bucketization to workf properly"
 > git rev-list --no-walk 00ecf2c908a7070e8bd5d3929ce6f422c0d12200 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1110005061003603298.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36463 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8ac437ef10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a486d3510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a486d3510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a881542d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a881542d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8a881542d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8a484ed410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a8809d9d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a8809d9d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a487eabd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a487eabd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8a487eabd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 41256 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37548 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing
---------------------------------------------------------------------------
nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 110 103 75 0 4% 45-206
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20
---------------------------------------------------------------------------
TOTAL 3148 369 1320 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.89%
================ 553 passed, 212 warnings in 453.28s (0:07:33) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8860082276376206715.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 03a50634faac5b72f5ceb3be7158ffbf61794ed4, no merge conflicts.
Running as SYSTEM
Setting status of 03a50634faac5b72f5ceb3be7158ffbf61794ed4 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1034/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 03a50634faac5b72f5ceb3be7158ffbf61794ed4^{commit} # timeout=10
Checking out Revision 03a50634faac5b72f5ceb3be7158ffbf61794ed4 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 03a50634faac5b72f5ceb3be7158ffbf61794ed4 # timeout=10
Commit message: "fixing bucketized behavior"
 > git rev-list --no-walk 5c39ed49cd67ae3d969da6c4be3889d3f871f6de # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1988088860742501560.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46509 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f77f1e89990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f77f76faa90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f77f76faa90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f78bc7c84d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f78bc7c84d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f78bc7c84d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f78bc86c750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f77dc371450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f77dc371450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f77c4395fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f77c4395fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f77c4395fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing
---------------------------------------------------------------------------
nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 121 113 81 0 4% 12-16, 53-249
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20
---------------------------------------------------------------------------
TOTAL 3159 379 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.59%
================ 553 passed, 212 warnings in 460.88s (0:07:40) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6667586222959812840.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 7cc9ec6e4f88701da8a5398b8477887154691864, no merge conflicts.
Running as SYSTEM
Setting status of 7cc9ec6e4f88701da8a5398b8477887154691864 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1035/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 7cc9ec6e4f88701da8a5398b8477887154691864^{commit} # timeout=10
Checking out Revision 7cc9ec6e4f88701da8a5398b8477887154691864 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 7cc9ec6e4f88701da8a5398b8477887154691864 # timeout=10
Commit message: "fixing some bucket stuff"
 > git rev-list --no-walk 03a50634faac5b72f5ceb3be7158ffbf61794ed4 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1794544947207974246.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44441 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f42705d8250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f42705e8450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f42705e8450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f427054e290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f427054e290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f427054e290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f427055c890>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f427055c850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f427055c850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f4270563cd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f4270563cd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f4270563cd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38520 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37548 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40032 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing
---------------------------------------------------------------------------
nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-253
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20
---------------------------------------------------------------------------
TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 452.05s (0:07:32) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8405706607108263696.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 5b8cdf05aefb51e9daba771493c791144df7adc2, no merge conflicts.
Running as SYSTEM
Setting status of 5b8cdf05aefb51e9daba771493c791144df7adc2 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1036/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 5b8cdf05aefb51e9daba771493c791144df7adc2^{commit} # timeout=10
Checking out Revision 5b8cdf05aefb51e9daba771493c791144df7adc2 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5b8cdf05aefb51e9daba771493c791144df7adc2 # timeout=10
Commit message: "changing preprocess and features in feature column utils"
 > git rev-list --no-walk 7cc9ec6e4f88701da8a5398b8477887154691864 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1355947627574453912.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/framework_utils/tensorflow/feature_column_utils.py
Oh no! 💥 💔 💥
1 file would be reformatted, 74 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7344957662496249490.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit aad5acc0129f0d64b78c1a89716a28ed7d9905eb, no merge conflicts.
Running as SYSTEM
Setting status of aad5acc0129f0d64b78c1a89716a28ed7d9905eb to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1037/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse aad5acc0129f0d64b78c1a89716a28ed7d9905eb^{commit} # timeout=10
Checking out Revision aad5acc0129f0d64b78c1a89716a28ed7d9905eb (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f aad5acc0129f0d64b78c1a89716a28ed7d9905eb # timeout=10
Commit message: "changing bucketize to geq and updating notebook"
 > git rev-list --no-walk 5b8cdf05aefb51e9daba771493c791144df7adc2 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1704038617127028868.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/framework_utils/tensorflow/feature_column_utils.py
Oh no! 💥 💔 💥
1 file would be reformatted, 74 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7467620612561454385.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 4319ccdf48ded3767f6ac4fffec718852f8b001a, no merge conflicts.
Running as SYSTEM
Setting status of 4319ccdf48ded3767f6ac4fffec718852f8b001a to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1038/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 4319ccdf48ded3767f6ac4fffec718852f8b001a^{commit} # timeout=10
Checking out Revision 4319ccdf48ded3767f6ac4fffec718852f8b001a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4319ccdf48ded3767f6ac4fffec718852f8b001a # timeout=10
Commit message: "blackening"
 > git rev-list --no-walk aad5acc0129f0d64b78c1a89716a28ed7d9905eb # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6125076540496084307.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35785 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc95c700bd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc97410c1d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc97410c1d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc95c782090>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc95c782090>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc95c782090>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc974048b50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc97408d2d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc97408d2d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc95c745e90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc95c745e90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc95c745e90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20

TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 449.78s (0:07:29) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1331363075598893432.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 84f3cb21f4da326bb12fbd4bcd99629b28393c43, no merge conflicts.
Running as SYSTEM
Setting status of 84f3cb21f4da326bb12fbd4bcd99629b28393c43 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1043/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 84f3cb21f4da326bb12fbd4bcd99629b28393c43^{commit} # timeout=10
Checking out Revision 84f3cb21f4da326bb12fbd4bcd99629b28393c43 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 84f3cb21f4da326bb12fbd4bcd99629b28393c43 # timeout=10
Commit message: "writing updates to notebook"
 > git rev-list --no-walk cd693d71c1641e70ee5c2df0c20606b5bff45965 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3304519511972128849.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 32815 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db82c5650>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db0465390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db0465390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db827e850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db827e850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db827e850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db0525d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db03833d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db03833d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db03f1590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db03f1590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db03f1590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38520 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40032 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20

TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 843.47s (0:14:03) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins3783604421021780500.sh

Member

@benfred benfred left a comment


This is awesome! Thanks for this

Comment on lines +53 to +54
for column in columns:
val ^= gdf[column].hash_values() # or however we want to do this aggregation
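
(For context only: a minimal sketch of how this kind of per-column hash aggregation could be completed into a bucketed cross. It is illustrative rather than the PR's actual implementation, and the hashed_cross name and num_buckets argument are made up.)

import cudf

def hashed_cross(gdf: cudf.DataFrame, columns, num_buckets: int) -> cudf.Series:
    # Illustrative sketch, not the PR's code: start from the first column's
    # hash values, fold the remaining columns in with XOR, then bucket.
    val = gdf[columns[0]].hash_values()
    for column in columns[1:]:
        val ^= gdf[column].hash_values()
    return val % num_buckets  # num_buckets is an assumed parameter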
Member


Our categorify op lets you pass in column groups and takes an 'encode_type' parameter which, if set to 'combo', does the categorical encoding on the cross the same way this op does:

https://github.com/NVIDIA/NVTabular/blob/f39e65e95d0af1d44ae9c2073a06c5a442d4de93/nvtabular/ops/categorify.py#L76-L81
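
(For reference, a rough sketch of the usage described above, based on the column-group and encode_type behavior mentioned here; the column names are placeholders and the exact keyword names may differ from the linked code.)

from nvtabular.ops import Categorify

# Placeholder column names: a multi-column group plus encode_type="combo"
# encodes the joint (crossed) combination rather than each column separately.
cross_cat = Categorify(columns=[["cat_a", "cat_b"]], encode_type="combo")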

What do you think about rolling the functionality for this op into the HashBucket op to be consistent (i.e., if passed a multi-column group, it would do the cross)?

Ronay is working on merging the HashBucket functionality with the categorify op, and this is one of the things I think we could do to minimize the delta in functionality between the two.

Contributor Author


Hmmm, that's an interesting idea... I think that could make sense. I'll need to think more about it from a conceptual standpoint, but they definitely seem equivalent.

Contributor Author


Based on the meeting today, do we want to shelve this until after the 0.3 release? While I agree that this functionality should all be wrapped together, the API design feels non-trivial, and at this point it might make more sense just to get this in for TF users (especially if it gets used primarily through the make_feature_column_workflow function, since any API changes will be handled on the backend and wouldn't require users to update their code).
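
(To illustrate that point, a hypothetical usage sketch of make_feature_column_workflow; the import path, argument order, and return values here are assumptions and may not match the merged code.)

import tensorflow as tf
from nvtabular.framework_utils.tensorflow import make_feature_column_workflow

# Made-up feature columns standing in for a real model's inputs.
feature_columns = [
    tf.feature_column.numeric_column("purchase_amount"),
    tf.feature_column.categorical_column_with_hash_bucket("user_id", hash_bucket_size=1000),
]

# Assumed behavior: returns an NVTabular Workflow performing the analogous GPU
# preprocessing, plus simplified feature columns to use once that workflow has run.
workflow, updated_columns = make_feature_column_workflow(feature_columns, "label")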

Member


sounds good - we can always come back to this

nvtabular/ops/hashed_cross.py (outdated review thread, resolved)
nvtabular/loader/tensorflow.py (outdated review thread, resolved)
@benfred changed the title from "[REVIEW] Adding ops for feature column functionality and feature column to workflow mapping function" to "Adding ops for feature column functionality and feature column to workflow mapping function" on Nov 2, 2020
@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 4c7c31b76564d87ba70db6789cbef9220779962b, has merge conflicts.
Running as SYSTEM
!!! PR mergeability status has changed !!!  
PR now has NO merge conflicts
Setting status of 4c7c31b76564d87ba70db6789cbef9220779962b to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1105/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 4c7c31b76564d87ba70db6789cbef9220779962b^{commit} # timeout=10
Checking out Revision 4c7c31b76564d87ba70db6789cbef9220779962b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4c7c31b76564d87ba70db6789cbef9220779962b # timeout=10
Commit message: "Apply suggestions from code review"
 > git rev-list --no-walk 6f95dcd651e6270c9a4cface9b0c88c9198e78e6 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins8405920878602430271.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 39465 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe824eed0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe80a1d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe80a1d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe809db90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe809db90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe809db90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe810cc50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe07d0190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe07d0190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe07d0d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe07d0d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe07d0d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 41256 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37548 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20

TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 449.41s (0:07:29) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6321422370962802906.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 31fa9f93f53687ffb2d51486426ba35012e76326, no merge conflicts.
Running as SYSTEM
Setting status of 31fa9f93f53687ffb2d51486426ba35012e76326 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1106/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 31fa9f93f53687ffb2d51486426ba35012e76326^{commit} # timeout=10
Checking out Revision 31fa9f93f53687ffb2d51486426ba35012e76326 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 31fa9f93f53687ffb2d51486426ba35012e76326 # timeout=10
Commit message: "Merge branch 'main' into fc_matching"
 > git rev-list --no-walk 4c7c31b76564d87ba70db6789cbef9220779962b # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins5999892585943008508.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
1 file would be reformatted, 75 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins6150452337499400679.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a, no merge conflicts.
Running as SYSTEM
Setting status of bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1107/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a^{commit} # timeout=10
Checking out Revision bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a # timeout=10
Commit message: "black"
 > git rev-list --no-walk 31fa9f93f53687ffb2d51486426ba35012e76326 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1123802092085219619.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 582 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 27%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 36%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
................................ [ 76%]
tests/unit/test_torch_dataloader.py ............................ [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
................................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 15 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46027 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f58276490>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f581d63d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f581d63d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f581d0990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f581d0990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f581d0990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f58253e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f5823b510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f5823b510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f58238910>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f58238910>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f58238910>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 41256 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 182->184
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 448 16 248 23 94% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 527->526, 577->582, 582, 585->586, 586, 629->630, 630, 698->686, 826->832, 832->exit, 874->875, 875, 884->890, 926->927, 927-929, 933->934, 934, 969->970, 970
setup.py 2 2 0 0 0% 18-20

TOTAL 3282 440 1370 169 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.49%
=========== 574 passed, 8 skipped, 212 warnings in 459.37s (0:07:39) ===========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5949652981608776205.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235, no merge conflicts.
Running as SYSTEM
Setting status of 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1109/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235^{commit} # timeout=10
Checking out Revision 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 # timeout=10
Commit message: "Merge branch 'main' into fc_matching"
 > git rev-list --no-walk a15e8ac0e9d703cd6676685ed23d18229ae5d171 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1313423597887978786.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 582 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 27%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 36%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py FFFFFFFFFFFF [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
................................ [ 76%]
tests/unit/test_torch_dataloader.py ......FF..FF..FFFFFFFFFFFFFF [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd33852ddd0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0f290>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
----------------------------- Captured stderr call -----------------------------
2020-11-03 18:13:39.416661: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-03 18:13:39.439267: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3198080000 Hz
2020-11-03 18:13:39.440344: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca35f25450 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-03 18:13:39.440403: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-03 18:13:39.757543: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca35ff6990 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-03 18:13:39.757587: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.757597: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.757605: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.757612: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.760091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.761262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.762422: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.763476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.763567: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-03 18:13:39.763608: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-03 18:13:39.763635: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-03 18:13:39.763660: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-03 18:13:39.763682: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-03 18:13:39.763705: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-03 18:13:39.763729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-03 18:13:39.773326: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-03 18:13:39.773405: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-03 18:13:39.779189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-03 18:13:39.779213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 1 2 3
2020-11-03 18:13:39.779223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N Y Y Y
2020-11-03 18:13:39.779258: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1: Y N Y Y
2020-11-03 18:13:39.779269: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 2: Y Y N Y
2020-11-03 18:13:39.779276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 3: Y Y Y N
2020-11-03 18:13:39.784611: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory) -> physical GPU (device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0)
2020-11-03 18:13:39.786085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15212 MB memory) -> physical GPU (device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0)
2020-11-03 18:13:39.787529: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15212 MB memory) -> physical GPU (device: 2, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0e:00.0, compute capability: 6.0)
2020-11-03 18:13:39.788961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15212 MB memory) -> physical GPU (device: 3, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0f:00.0, compute capability: 6.0)
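
For reference, this failure reduces to the cast-safety check quoted in the traceback: FillMedian computes the median of the integer id column, which lands between two values (999.5), and cudf refuses to cast that float back to int64 because the value would change. A minimal sketch with plain NumPy, assumed to mirror the cudf check shown above:

```python
import numpy as np

# Sketch of the safe-cast check from the traceback (NumPy standing in for
# cudf's NumericalColumn.fillna).
fill_value = 999.5              # median of an int64 column with an even row count
col_dtype = np.dtype("int64")   # dtype of the column being filled

fill_value_casted = col_dtype.type(fill_value)   # -> 999, the value changed
if not np.isnan(fill_value) and fill_value_casted != fill_value:
    raise TypeError(
        "Cannot safely cast non-equivalent {} to {}".format(
            type(fill_value).__name__, col_dtype.name
        )
    )
```

Filling with a value representable in the column's dtype, or casting the column to float before filling, avoids this check.
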
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd32409ce90>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87e5d40>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f86bc710>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87f87a0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd30819d110>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd30819f320>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3241809d0>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a5d8c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f851a6d0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0e4d0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3240ff2d0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3249ff320>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3081b4850>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3249ff710>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f84e8b10>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0e4d0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3240494d0>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87f2950>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd32416e950>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3081b8950>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3241167d0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0e170>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name0-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd30813a3d0>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f863e680>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
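
The test_empty_cols and test_gpu_dl failures below trip over the same FillMedian/int64 interaction as the dataloader tests above. For illustration only, here is a hedged sketch of the kind of guard an op could apply before calling fillna; fillna_compatible is a hypothetical helper, not the change made in this PR.

    import cudf

    def fillna_compatible(series: cudf.Series, fill_value) -> cudf.Series:
        # Hypothetical helper: fill nulls without triggering cudf's
        # "Cannot safely cast non-equivalent float to int64" error.
        casted = series.dtype.type(fill_value)
        if casted == fill_value:
            # The statistic is representable in the column dtype (e.g. 999.0 -> int64).
            return series.fillna(casted)
        # Otherwise promote the column to float64 so the fill is lossless.
        return series.astype("float64").fillna(fill_value)

With the integer "id" column, this promotes to float64 and fills with 999.5 instead of raising; whether to promote the column or round the statistic is a design choice left to the op.
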
_________ test_empty_cols[label_name0-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co1')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29016d1d0>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f85540e0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f85cae90>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f85c1d40>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co1')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29031c550>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f8374170>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-1-1e-06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_1e_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f856ab90>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2906b6cb0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_______________________ test_gpu_dl[None-parquet-1-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_0_00')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290590f50>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd290236c20>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_1e0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290556e90>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd290683ef0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_0_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd324115d90>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f84078c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[None-parquet-100-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_10')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290535b90>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2902368c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-100-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_00')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2901c28d0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3080960e0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-1-1e-06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_10')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29023fe10>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f8117710>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[devices1-parquet-1-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_11')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29010aad0>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd29055e290>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_12')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f8551050>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87e80e0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_13')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2905de210>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2902368c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
___________________ test_gpu_dl[devices1-parquet-100-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_14')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f8637610>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet'
devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f8232dd0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-100-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_15')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f85c3fd0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f848aef0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________________________ test_kill_dl[parquet-1e-06] __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_1e_06_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290559a90>
part_mem_fraction = 1e-06, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:183:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87e85f0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
__________________________ test_kill_dl[parquet-0.1] ___________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_0_1_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290391a90>
part_mem_fraction = 0.1, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:183:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f81e7560>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
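All of the failures above share one root cause: a float median statistic being filled into an integer column. Purely as an illustration (this is not the change that later made these tests pass), a fill step could coerce its statistic to the column's dtype before calling fillna; the sketch below uses pandas and a hypothetical fill_median_like helper as a CPU stand-in for the cudf call at nvtabular/ops/fill.py:113.

import numpy as np
import pandas as pd

def fill_median_like(col: pd.Series) -> pd.Series:
    # Hypothetical helper: compute the median, then coerce it to the column's
    # dtype for integer columns so fillna never sees a lossy float like 999.5.
    stat_val = col.median()
    if pd.api.types.is_integer_dtype(col.dtype):
        stat_val = col.dtype.type(stat_val)  # truncating cast, e.g. 999.5 -> 999
    return col.fillna(stat_val)

s = pd.Series([990, 1009, None, 980, 1020], dtype="Int64")
print(fill_median_like(s))  # the null is filled with 999 instead of raising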
=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 11 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35845 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd290235850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd2f843f250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd2f843f250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd2f843f750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd2f843f750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd2f843f750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd29056d1d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd2f8753b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd2f8753b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd308157990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd308157990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd308157990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 3 40 3 96% 54->55, 55-59, 87->89, 89, 182->184
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 25 60 10 84% 69->70, 70, 75-76, 104->105, 105, 118->119, 119, 133->134, 134, 139->140, 140, 144-145, 149-153, 156, 223->225, 225, 230->231, 231-235, 246->247, 247, 269->270, 270-271, 300->307, 307-308, 316-317
nvtabular/loader/tensorflow.py 110 22 48 12 76% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 274->275, 275, 281->282, 282, 288-291, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 4 10 3 80% 66->67, 67, 107->108, 108, 111->114, 114-115
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 448 16 248 23 94% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 527->526, 577->582, 582, 585->586, 586, 629->630, 630, 698->686, 826->832, 832->exit, 874->875, 875, 884->890, 926->927, 927-929, 933->934, 934, 969->970, 970
setup.py 2 2 0 0 0% 18-20

TOTAL 3282 466 1370 177 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.59%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06] - Typ...
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-0.1] - TypeE...
===== 30 failed, 544 passed, 8 skipped, 201 warnings in 333.81s (0:05:33) ======
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins3484142754550874042.sh

@benfred
Copy link
Member

benfred commented Nov 3, 2020

rerun tests

@nvidia-merlin-bot
Copy link
Contributor

Click to view CI Results
GitHub pull request #379 of commit 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235, no merge conflicts.
Running as SYSTEM
Setting status of 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1110/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235^{commit} # timeout=10
Checking out Revision 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 # timeout=10
Commit message: "Merge branch 'main' into fc_matching"
 > git rev-list --no-walk 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins3193439717990562499.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 582 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 27%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 36%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
................................ [ 76%]
tests/unit/test_torch_dataloader.py ............................ [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
................................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 15 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44823 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5a6590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c5a9950>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c5a9950>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c5ad110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c5ad110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5ad110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5df110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c636510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c636510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c5adb90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c5adb90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5adb90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 38520 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 40032 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 182->184
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 448 16 248 23 94% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 527->526, 577->582, 582, 585->586, 586, 629->630, 630, 698->686, 826->832, 832->exit, 874->875, 875, 884->890, 926->927, 927-929, 933->934, 934, 969->970, 970
setup.py 2 2 0 0 0% 18-20

TOTAL 3282 440 1370 169 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.49%
=========== 574 passed, 8 skipped, 212 warnings in 455.61s (0:07:35) ===========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6301827228761798190.sh

@benfred merged commit ca38bad into NVIDIA-Merlin:main Nov 3, 2020
@alecgunny mentioned this pull request Nov 3, 2020
# boundaries and embedding dim so that we can wrap
# with either indicator or embedding later
if key in [col.key for col in numeric_columns]:
    buckets[key] = (column.boundaries, embedding_dim)

I believe that here, and two lines below, it should be cat_column.boundaries. It fails to find the boundaries attribute when I try to use this utility, and it passes once I replace column.boundaries with cat_column.boundaries.
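
For context, here is a minimal, self-contained sketch of the fix this comment suggests. The "age" numeric column, its boundaries, and the embedding dimension are illustrative assumptions rather than the PR's actual code; the point is only that the bucket boundaries live on the wrapped bucketized column, not on the embedding/indicator wrapper.

import tensorflow as tf

# Illustrative feature columns (assumed names and values, not from the PR)
age = tf.feature_column.numeric_column("age")
age_buckets = tf.feature_column.bucketized_column(age, boundaries=[18.0, 35.0, 65.0])
embedded = tf.feature_column.embedding_column(age_buckets, dimension=8)

numeric_columns = [age]
buckets = {}

# The embedding column wraps the bucketized column, so the boundaries are
# reached through embedded.categorical_column, i.e. cat_column.boundaries.
cat_column = embedded.categorical_column
key = cat_column.source_column.key  # key of the underlying numeric column
if key in [col.key for col in numeric_columns]:
    buckets[key] = (cat_column.boundaries, embedded.dimension)

print(buckets)  # {'age': ((18.0, 35.0, 65.0), 8)}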

mikemckiernan pushed a commit that referenced this pull request Nov 24, 2022