Adding ops for feature column functionality and feature column to workflow mapping function #379

Merged
merged 33 commits on Nov 3, 2020

Conversation

alecgunny
Contributor

Increases NVTabular compatibility with the TensorFlow feature column API by adding the remaining necessary ops (cross and bucketize) and a function that maps a set of feature columns to an NVTabular workflow performing all of the analogous preprocessing. Addresses #371

HashedCross doesn't support multi-hot yet, and I'm not sure that extending it to multi-hots will necessarily be easy. For reference, the TF cross op handles multi-hots by taking the cartesian product of the indices of each feature. See the documentation here.
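As a rough illustration of that cartesian-product behavior (plain Python, not the actual TF implementation):

```python
# Illustration only: TF's crossed column pairs every index of one multi-hot
# feature with every index of the other, then hashes each crossed pair.
from itertools import product

feature_a = [3, 7]       # two category indices for a single example
feature_b = [1, 4, 9]    # three category indices for the same example

crossed_pairs = list(product(feature_a, feature_b))
# -> [(3, 1), (3, 4), (3, 9), (7, 1), (7, 4), (7, 9)]
# 6 crossed values for this example, each then hashed into the cross's bucket space.
```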

I still need to add bucketize support and test everything.
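For context, a minimal usage sketch of the mapping function; the import path matches the module added in this PR, but the exact signature and return values are assumptions, not the confirmed API:

```python
# Hypothetical usage sketch: the exact signature and return values of
# make_feature_column_workflow are assumptions, not the confirmed API.
import tensorflow as tf

from nvtabular.framework_utils.tensorflow import make_feature_column_workflow

# TF feature columns covering the preprocessing this PR targets.
age = tf.feature_column.numeric_column("age")
age_buckets = tf.feature_column.bucketized_column(age, boundaries=[18, 30, 45, 60])
occupation = tf.feature_column.categorical_column_with_hash_bucket(
    "occupation", hash_bucket_size=1000
)
age_x_occupation = tf.feature_column.crossed_column(
    [age_buckets, occupation], hash_bucket_size=10000
)

# Map the feature columns to an NVTabular workflow that performs the
# analogous preprocessing (bucketizing, hashing, crossing).
workflow, feature_columns = make_feature_column_workflow(
    [age_buckets, occupation, age_x_occupation], label_name="label"
)
```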

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit bd068b8168424c4a151775173c529db7c07c6720, no merge conflicts.
Running as SYSTEM
Setting status of bd068b8168424c4a151775173c529db7c07c6720 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1005/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse bd068b8168424c4a151775173c529db7c07c6720^{commit} # timeout=10
Checking out Revision bd068b8168424c4a151775173c529db7c07c6720 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f bd068b8168424c4a151775173c529db7c07c6720 # timeout=10
Commit message: "adding ops and feature column utils"
 > git rev-list --no-walk 171491a2233ecaa82788cf026f779e0c39e8b87a # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7714710515846326646.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins2911655224900517134.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 951b025ce20f4a29d0949d6223a9f93cee0dc820, no merge conflicts.
Running as SYSTEM
Setting status of 951b025ce20f4a29d0949d6223a9f93cee0dc820 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1006/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 951b025ce20f4a29d0949d6223a9f93cee0dc820^{commit} # timeout=10
Checking out Revision 951b025ce20f4a29d0949d6223a9f93cee0dc820 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 951b025ce20f4a29d0949d6223a9f93cee0dc820 # timeout=10
Commit message: "importing hashed cross in ops"
 > git rev-list --no-walk bd068b8168424c4a151775173c529db7c07c6720 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins3759065021120467545.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins61563149220576116.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5, no merge conflicts.
Running as SYSTEM
Setting status of 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1007/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5^{commit} # timeout=10
Checking out Revision 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 70c3a4802849ae14d14bf32c1bdbf73a60ab15b5 # timeout=10
Commit message: "switching to xor"
 > git rev-list --no-walk 951b025ce20f4a29d0949d6223a9f93cee0dc820 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6439279322316503336.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins3780925671695674544.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 319d475f2526c3d95968e1d09476b036d2d3e0d1, no merge conflicts.
Running as SYSTEM
Setting status of 319d475f2526c3d95968e1d09476b036d2d3e0d1 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1009/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 319d475f2526c3d95968e1d09476b036d2d3e0d1^{commit} # timeout=10
Checking out Revision 319d475f2526c3d95968e1d09476b036d2d3e0d1 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 319d475f2526c3d95968e1d09476b036d2d3e0d1 # timeout=10
Commit message: "hashed cross and workflow builder working"
 > git rev-list --no-walk f5a6ddd36454d7f0c19634070c801af0597e3b9f # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins2894470993341789209.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
1 file would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins8960642179791915675.sh

@alecgunny
Contributor Author

@benfred what's our stance on adding ops without list support? If we're ok with it, should we add a support matrix in the documentation?

@alecgunny
Contributor Author

With the addition of bucketize, we should have full TF feature column coverage (minus the sequence columns, which I won't worry about for now). The shared embeddings and weighted shared embeddings are more Keras layers than preprocessing steps, so we'll still need to add layers that cover those. But overall this should put us in pretty good shape. We just need to build tests, docs, and an example, and we should be good to go.
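To summarize that coverage (an illustrative summary only, not the converter's actual lookup table; the Categorify pairing is assumed from the op files touched in this PR):

```python
# Illustrative coverage summary; entries marked "assumed" are not confirmed by this PR.
TF_FEATURE_COLUMN_COVERAGE = {
    "crossed_column": "nvtabular.ops.HashedCross",        # added in this PR
    "bucketized_column": "nvtabular.ops.Bucketize",       # added in this PR
    "vocabulary-based categorical columns": "nvtabular.ops.Categorify",  # assumed
    "shared / weighted shared embeddings": "Keras layers (follow-up work)",
    "sequence columns": "out of scope for now",
}
```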

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed, no merge conflicts.
Running as SYSTEM
Setting status of 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1011/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed^{commit} # timeout=10
Checking out Revision 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed # timeout=10
Commit message: "adding bucketize"
 > git rev-list --no-walk 010157a6e70d28c90c508e0b3430fc2a76a6cd14 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins6823510643244802508.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
Oh no! 💥 💔 💥
2 files would be reformatted, 73 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins8538873942078219243.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit d2f1efd8ae32c71cca4daa954b66a56f3b6ca126, no merge conflicts.
Running as SYSTEM
Setting status of d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1012/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse d2f1efd8ae32c71cca4daa954b66a56f3b6ca126^{commit} # timeout=10
Checking out Revision d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 # timeout=10
Commit message: "adding op tests"
 > git rev-list --no-walk 46abb5b6a970b9ea383964cf5eeb26eaa2fb3fed # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins9117988078583547627.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
3 files would be reformatted, 72 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins3583608757503062890.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8, no merge conflicts.
Running as SYSTEM
Setting status of 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1013/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8^{commit} # timeout=10
Checking out Revision 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 # timeout=10
Commit message: "blackening"
 > git rev-list --no-walk d2f1efd8ae32c71cca4daa954b66a56f3b6ca126 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6095631252068171384.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_io.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/categorify.py
Oh no! 💥 💔 💥
4 files would be reformatted, 71 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins156390319430102341.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit ebf234766dc5efb29c5d858f27eff061ca267c7f, no merge conflicts.
Running as SYSTEM
Setting status of ebf234766dc5efb29c5d858f27eff061ca267c7f to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1014/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse ebf234766dc5efb29c5d858f27eff061ca267c7f^{commit} # timeout=10
Checking out Revision ebf234766dc5efb29c5d858f27eff061ca267c7f (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f ebf234766dc5efb29c5d858f27eff061ca267c7f # timeout=10
Commit message: "blackening"
 > git rev-list --no-walk 026d2fe6b4d63a7ca74b965a5ffaa52187b78fe8 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7642891004762184407.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/benchmarks/test_notebooks.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_io.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/categorify.py
Oh no! 💥 💔 💥
5 files would be reformatted, 70 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1940264879864262767.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 66419f4267583bbdbb4302757fd016c1a88efd94, no merge conflicts.
Running as SYSTEM
Setting status of 66419f4267583bbdbb4302757fd016c1a88efd94 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1015/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 66419f4267583bbdbb4302757fd016c1a88efd94^{commit} # timeout=10
Checking out Revision 66419f4267583bbdbb4302757fd016c1a88efd94 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 66419f4267583bbdbb4302757fd016c1a88efd94 # timeout=10
Commit message: "documenting"
 > git rev-list --no-walk ebf234766dc5efb29c5d858f27eff061ca267c7f # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins5972829207961115146.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/bucketize.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/benchmarks/test_notebooks.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/hashed_cross.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_io.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/categorify.py
Oh no! 💥 💔 💥
5 files would be reformatted, 70 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins4954172438073774049.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 1446a6407a3c1d468eba5430b919e13c23f49771, no merge conflicts.
Running as SYSTEM
Setting status of 1446a6407a3c1d468eba5430b919e13c23f49771 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1016/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 1446a6407a3c1d468eba5430b919e13c23f49771^{commit} # timeout=10
Checking out Revision 1446a6407a3c1d468eba5430b919e13c23f49771 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 1446a6407a3c1d468eba5430b919e13c23f49771 # timeout=10
Commit message: "updated blackening"
 > git rev-list --no-walk 66419f4267583bbdbb4302757fd016c1a88efd94 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7595550585448599460.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/framework_utils/tensorflow/feature_column_utils.py:3:1: F401 'yaml' imported but unused
./nvtabular/framework_utils/tensorflow/__init__.py:17:1: F401 '.feature_column_utils.make_feature_column_workflow' imported but unused
./nvtabular/ops/hashed_cross.py:17:1: F401 'cudf.utils.dtypes.is_list_dtype' imported but unused
./nvtabular/ops/hashed_cross.py:20:1: F401 '.categorify._encode_list_column' imported but unused
./nvtabular/ops/bucketize.py:18:1: F401 'cudf.utils.dtypes.is_list_dtype' imported but unused
./nvtabular/ops/bucketize.py:21:1: F401 '.categorify._encode_list_column' imported but unused
./nvtabular/ops/bucketize.py:29:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7414266527318070797.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 8b711512cfca966e3b5dfb6c7b4560aa353d97e8, no merge conflicts.
Running as SYSTEM
Setting status of 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1017/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 8b711512cfca966e3b5dfb6c7b4560aa353d97e8^{commit} # timeout=10
Checking out Revision 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 # timeout=10
Commit message: "fixing formatting issues"
 > git rev-list --no-walk 1446a6407a3c1d468eba5430b919e13c23f49771 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2182059037975933060.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/framework_utils/tensorflow/feature_column_utils.py:3:1: F401 'yaml' imported but unused
./nvtabular/ops/bucketize.py:18:1: F401 'cudf.utils.dtypes.is_list_dtype' imported but unused
./nvtabular/ops/bucketize.py:21:1: F401 '.categorify._encode_list_column' imported but unused
./nvtabular/ops/bucketize.py:29:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins4688283507400897510.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 4afd77285818b2e75637fcbc59024793d41e311b, no merge conflicts.
Running as SYSTEM
Setting status of 4afd77285818b2e75637fcbc59024793d41e311b to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1018/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 4afd77285818b2e75637fcbc59024793d41e311b^{commit} # timeout=10
Checking out Revision 4afd77285818b2e75637fcbc59024793d41e311b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4afd77285818b2e75637fcbc59024793d41e311b # timeout=10
Commit message: "fixing formatting issues"
 > git rev-list --no-walk 8b711512cfca966e3b5dfb6c7b4560aa353d97e8 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins8789470020834365298.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/framework_utils/tensorflow/feature_column_utils.py:3:1: F401 'yaml' imported but unused
./nvtabular/ops/bucketize.py:30:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7235389498370303335.sh

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 5a22fb406220010733c1bd3221d2542ad483a0bd, no merge conflicts.
Running as SYSTEM
Setting status of 5a22fb406220010733c1bd3221d2542ad483a0bd to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1019/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 5a22fb406220010733c1bd3221d2542ad483a0bd^{commit} # timeout=10
Checking out Revision 5a22fb406220010733c1bd3221d2542ad483a0bd (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5a22fb406220010733c1bd3221d2542ad483a0bd # timeout=10
Commit message: "fixing formatting issues"
 > git rev-list --no-walk 4afd77285818b2e75637fcbc59024793d41e311b # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins4031815495685499956.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
./tests/unit/test_ops.py:1030:24: F821 undefined name 'op'
./nvtabular/ops/bucketize.py:30:18: F821 undefined name 'CONT'
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins3387745480078041911.sh

@alecgunny
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

CI Results:
GitHub pull request #379 of commit 5c39ed49cd67ae3d969da6c4be3889d3f871f6de, no merge conflicts.
Running as SYSTEM
Setting status of 5c39ed49cd67ae3d969da6c4be3889d3f871f6de to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1033/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 5c39ed49cd67ae3d969da6c4be3889d3f871f6de^{commit} # timeout=10
Checking out Revision 5c39ed49cd67ae3d969da6c4be3889d3f871f6de (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5c39ed49cd67ae3d969da6c4be3889d3f871f6de # timeout=10
Commit message: "fixing bucketization to workf properly"
 > git rev-list --no-walk 00ecf2c908a7070e8bd5d3929ce6f422c0d12200 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1110005061003603298.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36463 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8ac437ef10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a486d3510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a486d3510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a881542d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a881542d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8a881542d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8a484ed410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a8809d9d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a8809d9d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8a487eabd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8a487eabd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8a487eabd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 41256 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37548 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing
---------------------------------------------------------------------------
nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 110 103 75 0 4% 45-206
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20
---------------------------------------------------------------------------
TOTAL 3148 369 1320 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.89%
================ 553 passed, 212 warnings in 453.28s (0:07:33) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8860082276376206715.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 03a50634faac5b72f5ceb3be7158ffbf61794ed4, no merge conflicts.
Running as SYSTEM
Setting status of 03a50634faac5b72f5ceb3be7158ffbf61794ed4 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1034/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 03a50634faac5b72f5ceb3be7158ffbf61794ed4^{commit} # timeout=10
Checking out Revision 03a50634faac5b72f5ceb3be7158ffbf61794ed4 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 03a50634faac5b72f5ceb3be7158ffbf61794ed4 # timeout=10
Commit message: "fixing bucketized behavior"
 > git rev-list --no-walk 5c39ed49cd67ae3d969da6c4be3889d3f871f6de # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1988088860742501560.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46509 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f77f1e89990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f77f76faa90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f77f76faa90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f78bc7c84d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f78bc7c84d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f78bc7c84d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f78bc86c750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f77dc371450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f77dc371450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f77c4395fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f77c4395fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f77c4395fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing
---------------------------------------------------------------------------
nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 121 113 81 0 4% 12-16, 53-249
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20
---------------------------------------------------------------------------
TOTAL 3159 379 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.59%
================ 553 passed, 212 warnings in 460.88s (0:07:40) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6667586222959812840.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 7cc9ec6e4f88701da8a5398b8477887154691864, no merge conflicts.
Running as SYSTEM
Setting status of 7cc9ec6e4f88701da8a5398b8477887154691864 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1035/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 7cc9ec6e4f88701da8a5398b8477887154691864^{commit} # timeout=10
Checking out Revision 7cc9ec6e4f88701da8a5398b8477887154691864 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 7cc9ec6e4f88701da8a5398b8477887154691864 # timeout=10
Commit message: "fixing some bucket stuff"
 > git rev-list --no-walk 03a50634faac5b72f5ceb3be7158ffbf61794ed4 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1794544947207974246.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44441 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f42705d8250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f42705e8450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f42705e8450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f427054e290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f427054e290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f427054e290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f427055c890>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f427055c850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f427055c850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f4270563cd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f4270563cd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f4270563cd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38520 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37548 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40032 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing
---------------------------------------------------------------------------
nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-253
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20
---------------------------------------------------------------------------
TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 452.05s (0:07:32) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8405706607108263696.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 5b8cdf05aefb51e9daba771493c791144df7adc2, no merge conflicts.
Running as SYSTEM
Setting status of 5b8cdf05aefb51e9daba771493c791144df7adc2 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1036/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 5b8cdf05aefb51e9daba771493c791144df7adc2^{commit} # timeout=10
Checking out Revision 5b8cdf05aefb51e9daba771493c791144df7adc2 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5b8cdf05aefb51e9daba771493c791144df7adc2 # timeout=10
Commit message: "changing preprocess and features in feature column utils"
 > git rev-list --no-walk 7cc9ec6e4f88701da8a5398b8477887154691864 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1355947627574453912.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/framework_utils/tensorflow/feature_column_utils.py
Oh no! 💥 💔 💥
1 file would be reformatted, 74 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7344957662496249490.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit aad5acc0129f0d64b78c1a89716a28ed7d9905eb, no merge conflicts.
Running as SYSTEM
Setting status of aad5acc0129f0d64b78c1a89716a28ed7d9905eb to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1037/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse aad5acc0129f0d64b78c1a89716a28ed7d9905eb^{commit} # timeout=10
Checking out Revision aad5acc0129f0d64b78c1a89716a28ed7d9905eb (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f aad5acc0129f0d64b78c1a89716a28ed7d9905eb # timeout=10
Commit message: "changing bucketize to geq and updating notebook"
 > git rev-list --no-walk 5b8cdf05aefb51e9daba771493c791144df7adc2 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1704038617127028868.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/framework_utils/tensorflow/feature_column_utils.py
Oh no! 💥 💔 💥
1 file would be reformatted, 74 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7467620612561454385.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 4319ccdf48ded3767f6ac4fffec718852f8b001a, no merge conflicts.
Running as SYSTEM
Setting status of 4319ccdf48ded3767f6ac4fffec718852f8b001a to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1038/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 4319ccdf48ded3767f6ac4fffec718852f8b001a^{commit} # timeout=10
Checking out Revision 4319ccdf48ded3767f6ac4fffec718852f8b001a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4319ccdf48ded3767f6ac4fffec718852f8b001a # timeout=10
Commit message: "blackening"
 > git rev-list --no-walk aad5acc0129f0d64b78c1a89716a28ed7d9905eb # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6125076540496084307.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35785 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc95c700bd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc97410c1d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc97410c1d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc95c782090>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc95c782090>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc95c782090>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc974048b50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc97408d2d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc97408d2d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fc95c745e90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fc95c745e90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fc95c745e90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20

TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 449.78s (0:07:29) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1331363075598893432.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 84f3cb21f4da326bb12fbd4bcd99629b28393c43, no merge conflicts.
Running as SYSTEM
Setting status of 84f3cb21f4da326bb12fbd4bcd99629b28393c43 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1043/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 84f3cb21f4da326bb12fbd4bcd99629b28393c43^{commit} # timeout=10
Checking out Revision 84f3cb21f4da326bb12fbd4bcd99629b28393c43 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 84f3cb21f4da326bb12fbd4bcd99629b28393c43 # timeout=10
Commit message: "writing updates to notebook"
 > git rev-list --no-walk cd693d71c1641e70ee5c2df0c20606b5bff45965 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3304519511972128849.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 32815 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db82c5650>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db0465390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db0465390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db827e850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db827e850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db827e850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db0525d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db03833d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db03833d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0db03f1590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0db03f1590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0db03f1590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38520 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 40032 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 335->338, 338
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20

TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 843.47s (0:14:03) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins3783604421021780500.sh

Member

@benfred benfred left a comment


This is awesome! Thanks for this

Comment on lines +53 to +54
for column in columns:
val ^= gdf[column].hash_values() # or however we want to do this aggregation
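
(For context only: a minimal sketch of how this kind of per-column hash aggregation could be completed into a bucketed cross. It is illustrative rather than the PR's actual implementation, and the hashed_cross name and num_buckets argument are made up.)

import cudf

def hashed_cross(gdf: cudf.DataFrame, columns, num_buckets: int) -> cudf.Series:
    # Illustrative sketch, not the PR's code: start from the first column's
    # hash values, fold the remaining columns in with XOR, then bucket.
    val = gdf[columns[0]].hash_values()
    for column in columns[1:]:
        val ^= gdf[column].hash_values()
    return val % num_buckets  # num_buckets is an assumed parameter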
Member


Our categorify op lets you pass in column groups and takes an 'encode_type' parameter which, if set to 'combo', does the categorical encoding on the cross the same way this op does:

https://github.com/NVIDIA/NVTabular/blob/f39e65e95d0af1d44ae9c2073a06c5a442d4de93/nvtabular/ops/categorify.py#L76-L81
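
(For reference, a rough sketch of the usage described above, based on the column-group and encode_type behavior mentioned here; the column names are placeholders and the exact keyword names may differ from the linked code.)

from nvtabular.ops import Categorify

# Placeholder column names: a multi-column group plus encode_type="combo"
# encodes the joint (crossed) combination rather than each column separately.
cross_cat = Categorify(columns=[["cat_a", "cat_b"]], encode_type="combo")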

What do you think about rolling the functionality for this op into the HashBucket op to be consistent (i.e., if passed a multi-column group, it would do the cross)?

Ronay is working on merging the HashBucket functionality with the categorify op, and this is one of the things I think we could do to minimize the delta in functionality between the two.

Contributor Author


Hmmm, that's an interesting idea... I think that could make sense. I'll need to think more about it from a conceptual standpoint, but they definitely seem equivalent.

Contributor Author


Based on the meeting today, do we want to shelve this until after the 0.3 release? While I agree that this functionality should all be wrapped together, the API design feels non-trivial, and at this point it might make more sense just to get this in for TF users (especially if it gets used primarily through the make_feature_column_workflow function, since any API changes will be handled on the backend and wouldn't require users to update their code).
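
(To illustrate that point, a hypothetical usage sketch of make_feature_column_workflow; the import path, argument order, and return values here are assumptions and may not match the merged code.)

import tensorflow as tf
from nvtabular.framework_utils.tensorflow import make_feature_column_workflow

# Made-up feature columns standing in for a real model's inputs.
feature_columns = [
    tf.feature_column.numeric_column("purchase_amount"),
    tf.feature_column.categorical_column_with_hash_bucket("user_id", hash_bucket_size=1000),
]

# Assumed behavior: returns an NVTabular Workflow performing the analogous GPU
# preprocessing, plus simplified feature columns to use once that workflow has run.
workflow, updated_columns = make_feature_column_workflow(feature_columns, "label")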

Member


sounds good - we can always come back to this

nvtabular/ops/hashed_cross.py (outdated review thread, resolved)
nvtabular/loader/tensorflow.py (outdated review thread, resolved)
@benfred changed the title from "[REVIEW] Adding ops for feature column functionality and feature column to workflow mapping function" to "Adding ops for feature column functionality and feature column to workflow mapping function" on Nov 2, 2020
@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 4c7c31b76564d87ba70db6789cbef9220779962b, has merge conflicts.
Running as SYSTEM
!!! PR mergeability status has changed !!!  
PR now has NO merge conflicts
Setting status of 4c7c31b76564d87ba70db6789cbef9220779962b to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1105/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 4c7c31b76564d87ba70db6789cbef9220779962b^{commit} # timeout=10
Checking out Revision 4c7c31b76564d87ba70db6789cbef9220779962b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4c7c31b76564d87ba70db6789cbef9220779962b # timeout=10
Commit message: "Apply suggestions from code review"
 > git rev-list --no-walk 6f95dcd651e6270c9a4cface9b0c88c9198e78e6 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins8405920878602430271.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
75 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 553 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 9%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 19%]
............................... [ 25%]
tests/unit/test_notebooks.py .... [ 26%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 71%]
................................ [ 77%]
tests/unit/test_torch_dataloader.py ............................ [ 82%]
tests/unit/test_workflow.py ............................................ [ 90%]
....................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 12 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_io.py::test_parquet_lists[0]
tests/unit/test_io.py::test_parquet_lists[1]
tests/unit/test_io.py::test_parquet_lists[2]
tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 39465 instead
http_address["port"], self.http_server.port

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe824eed0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe80a1d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe80a1d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe809db90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe809db90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe809db90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe810cc50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe07d0190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe07d0190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7febe07d0d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7febe07d0d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7febe07d0d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 41256 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37548 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:72: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 190->191, 191, 203->204, 204, 212->213, 213, 221->233, 226->231, 231-233, 308->309, 309, 323->324, 324-325, 343->344, 344
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 174 4 58 4 97% 136->137, 137, 208->211, 211-213, 250->252, 258->263
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 11 45 3 90% 30, 47, 71->72, 72, 110, 113, 126->127, 127-128, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 73->74, 74
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 423 38 234 24 89% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 482->483, 483, 501->503, 503-512, 523->522, 572->577, 577, 580->581, 581, 616->617, 617, 666->657, 732->743, 743, 765-795, 822->823, 823, 836->839, 869->870, 870-872, 876->877, 877, 910->911, 911
setup.py 2 2 0 0 0% 18-20

TOTAL 3163 383 1326 173 85%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.52%
================ 553 passed, 212 warnings in 449.41s (0:07:29) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6321422370962802906.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 31fa9f93f53687ffb2d51486426ba35012e76326, no merge conflicts.
Running as SYSTEM
Setting status of 31fa9f93f53687ffb2d51486426ba35012e76326 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1106/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 31fa9f93f53687ffb2d51486426ba35012e76326^{commit} # timeout=10
Checking out Revision 31fa9f93f53687ffb2d51486426ba35012e76326 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 31fa9f93f53687ffb2d51486426ba35012e76326 # timeout=10
Commit message: "Merge branch 'main' into fc_matching"
 > git rev-list --no-walk 4c7c31b76564d87ba70db6789cbef9220779962b # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins5999892585943008508.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
1 file would be reformatted, 75 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins6150452337499400679.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a, no merge conflicts.
Running as SYSTEM
Setting status of bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1107/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a^{commit} # timeout=10
Checking out Revision bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f bd6ec9f139f38f33bb47c0d4d93725ea5f56c33a # timeout=10
Commit message: "black"
 > git rev-list --no-walk 31fa9f93f53687ffb2d51486426ba35012e76326 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1123802092085219619.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 582 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 27%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 36%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
................................ [ 76%]
tests/unit/test_torch_dataloader.py ............................ [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
................................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 15 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46027 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f58276490>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f581d63d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f581d63d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f581d0990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f581d0990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f581d0990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f58253e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f5823b510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f5823b510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f58238910>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f58238910>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f58238910>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 41256 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 39240 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 37728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 182->184
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 448 16 248 23 94% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 527->526, 577->582, 582, 585->586, 586, 629->630, 630, 698->686, 826->832, 832->exit, 874->875, 875, 884->890, 926->927, 927-929, 933->934, 934, 969->970, 970
setup.py 2 2 0 0 0% 18-20

TOTAL 3282 440 1370 169 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.49%
=========== 574 passed, 8 skipped, 212 warnings in 459.37s (0:07:39) ===========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5949652981608776205.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #379 of commit 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235, no merge conflicts.
Running as SYSTEM
Setting status of 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1109/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235^{commit} # timeout=10
Checking out Revision 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 # timeout=10
Commit message: "Merge branch 'main' into fc_matching"
 > git rev-list --no-walk a15e8ac0e9d703cd6676685ed23d18229ae5d171 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1313423597887978786.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 582 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 27%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 36%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py FFFFFFFFFFFF [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
................................ [ 76%]
tests/unit/test_torch_dataloader.py ......FF..FF..FFFFFFFFFFFFFF [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd33852ddd0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0f290>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
----------------------------- Captured stderr call -----------------------------
2020-11-03 18:13:39.416661: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-03 18:13:39.439267: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3198080000 Hz
2020-11-03 18:13:39.440344: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca35f25450 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-03 18:13:39.440403: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-03 18:13:39.757543: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca35ff6990 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-03 18:13:39.757587: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.757597: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.757605: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.757612: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-03 18:13:39.760091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.761262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.762422: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.763476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-03 18:13:39.763567: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-03 18:13:39.763608: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-03 18:13:39.763635: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-03 18:13:39.763660: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-03 18:13:39.763682: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-03 18:13:39.763705: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-03 18:13:39.763729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-03 18:13:39.773326: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-03 18:13:39.773405: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-03 18:13:39.779189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-03 18:13:39.779213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 1 2 3
2020-11-03 18:13:39.779223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N Y Y Y
2020-11-03 18:13:39.779258: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1: Y N Y Y
2020-11-03 18:13:39.779269: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 2: Y Y N Y
2020-11-03 18:13:39.779276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 3: Y Y Y N
2020-11-03 18:13:39.784611: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory) -> physical GPU (device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0)
2020-11-03 18:13:39.786085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15212 MB memory) -> physical GPU (device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0)
2020-11-03 18:13:39.787529: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15212 MB memory) -> physical GPU (device: 2, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0e:00.0, compute capability: 6.0)
2020-11-03 18:13:39.788961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15212 MB memory) -> physical GPU (device: 3, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0f:00.0, compute capability: 6.0)
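
For reference, this failure reduces to the cast-safety check quoted in the traceback: FillMedian computes the median of the integer id column, which lands between two values (999.5), and cudf refuses to cast that float back to int64 because the value would change. A minimal sketch with plain NumPy, assumed to mirror the cudf check shown above:

```python
import numpy as np

# Sketch of the safe-cast check from the traceback (NumPy standing in for
# cudf's NumericalColumn.fillna).
fill_value = 999.5              # median of an int64 column with an even row count
col_dtype = np.dtype("int64")   # dtype of the column being filled

fill_value_casted = col_dtype.type(fill_value)   # -> 999, the value changed
if not np.isnan(fill_value) and fill_value_casted != fill_value:
    raise TypeError(
        "Cannot safely cast non-equivalent {} to {}".format(
            type(fill_value).__name__, col_dtype.name
        )
    )
```

Filling with a value representable in the column's dtype, or casting the column to float before filling, avoids this check.
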
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd32409ce90>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87e5d40>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f86bc710>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87f87a0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd30819d110>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd30819f320>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3241809d0>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a5d8c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f851a6d0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0e4d0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3240ff2d0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3249ff320>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3081b4850>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3249ff710>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f84e8b10>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0e4d0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3240494d0>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87f2950>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd32416e950>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3081b8950>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd3241167d0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:60:


nvtabular/workflow.py:846: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd324a0e170>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name0-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd30813a3d0>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f863e680>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
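
The test_empty_cols and test_gpu_dl failures below trip over the same FillMedian/int64 interaction as the dataloader tests above. For illustration only, here is a hedged sketch of the kind of guard an op could apply before calling fillna; fillna_compatible is a hypothetical helper, not the change made in this PR.

    import cudf

    def fillna_compatible(series: cudf.Series, fill_value) -> cudf.Series:
        # Hypothetical helper: fill nulls without triggering cudf's
        # "Cannot safely cast non-equivalent float to int64" error.
        casted = series.dtype.type(fill_value)
        if casted == fill_value:
            # The statistic is representable in the column dtype (e.g. 999.0 -> int64).
            return series.fillna(casted)
        # Otherwise promote the column to float64 so the fill is lossless.
        return series.astype("float64").fillna(fill_value)

With the integer "id" column, this promotes to float64 and fills with 999.5 instead of raising; whether to promote the column or round the statistic is a design choice left to the op.
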
_________ test_empty_cols[label_name0-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co1')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29016d1d0>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f85540e0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f85cae90>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f85c1d40>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co1')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29031c550>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
      output_format=None,
    )

tests/unit/test_torch_dataloader.py:70:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f8374170>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-1-1e-06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_1e_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f856ab90>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2906b6cb0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_______________________ test_gpu_dl[None-parquet-1-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_0_00')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290590f50>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd290236c20>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_1e0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290556e90>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd290683ef0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_0_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd324115d90>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f84078c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[None-parquet-100-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_10')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290535b90>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2902368c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-100-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_00')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2901c28d0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd3080960e0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-1-1e-06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_10')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29023fe10>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f8117710>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[devices1-parquet-1-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_11')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd29010aad0>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd29055e290>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_12')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f8551050>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87e80e0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_13')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2905de210>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2902368c0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
___________________ test_gpu_dl[devices1-parquet-100-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_14')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f8637610>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet'
devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f8232dd0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-100-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_15')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd2f85c3fd0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:112:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f848aef0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________________________ test_kill_dl[parquet-1e-06] __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_1e_06_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290559a90>
part_mem_fraction = 1e-06, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:183:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f87e85f0>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
__________________________ test_kill_dl[parquet-0.1] ___________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_0_1_0')
df = name-cat name-string id label x y
0 Charlie Edith 1024 1054 0.763430 -0.231628
...ay 929 972 -0.219996 -0.200242
2160 Ursula Ray 1045 979 -0.482353 0.136629

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fd290391a90>
part_mem_fraction = 0.1, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:183:


nvtabular/workflow.py:784: in apply
dtypes=dtypes,
nvtabular/workflow.py:887: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:635: in exec_phase
_ddf = self._aggregated_dask_transform(_ddf, transforms)
nvtabular/workflow.py:604: in _aggregated_dask_transform
meta = logic(meta, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:82: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fd2f81e7560>
fill_value = 999.5

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
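All of the failures above share one root cause: a float median statistic being filled into an integer column. Purely as an illustration (this is not the change that later made these tests pass), a fill step could coerce its statistic to the column's dtype before calling fillna; the sketch below uses pandas and a hypothetical fill_median_like helper as a CPU stand-in for the cudf call at nvtabular/ops/fill.py:113.

import numpy as np
import pandas as pd

def fill_median_like(col: pd.Series) -> pd.Series:
    # Hypothetical helper: compute the median, then coerce it to the column's
    # dtype for integer columns so fillna never sees a lossy float like 999.5.
    stat_val = col.median()
    if pd.api.types.is_integer_dtype(col.dtype):
        stat_val = col.dtype.type(stat_val)  # truncating cast, e.g. 999.5 -> 999
    return col.fillna(stat_val)

s = pd.Series([990, 1009, None, 980, 1020], dtype="Int64")
print(fill_median_like(s))  # the null is filled with 999 instead of raising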
=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 11 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35845 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd290235850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd2f843f250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd2f843f250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd2f843f750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd2f843f750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd2f843f750>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd29056d1d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd2f8753b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd2f8753b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fd308157990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fd308157990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fd308157990>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 3 40 3 96% 54->55, 55-59, 87->89, 89, 182->184
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 25 60 10 84% 69->70, 70, 75-76, 104->105, 105, 118->119, 119, 133->134, 134, 139->140, 140, 144-145, 149-153, 156, 223->225, 225, 230->231, 231-235, 246->247, 247, 269->270, 270-271, 300->307, 307-308, 316-317
nvtabular/loader/tensorflow.py 110 22 48 12 76% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 274->275, 275, 281->282, 282, 288-291, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 4 10 3 80% 66->67, 67, 107->108, 108, 111->114, 114-115
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 448 16 248 23 94% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 527->526, 577->582, 582, 585->586, 586, 629->630, 630, 698->686, 826->832, 832->exit, 874->875, 875, 884->890, 926->927, 927-929, 933->934, 934, 969->970, 970
setup.py 2 2 0 0 0% 18-20

TOTAL 3282 466 1370 177 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.59%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06] - Typ...
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-0.1] - TypeE...
===== 30 failed, 544 passed, 8 skipped, 201 warnings in 333.81s (0:05:33) ======
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins3484142754550874042.sh

@benfred
Copy link
Member

benfred commented Nov 3, 2020

rerun tests

@nvidia-merlin-bot
Copy link
Contributor

Click to view CI Results
GitHub pull request #379 of commit 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235, no merge conflicts.
Running as SYSTEM
Setting status of 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1110/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/379/*:refs/remotes/origin/pr/379/* # timeout=10
 > git rev-parse 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235^{commit} # timeout=10
Checking out Revision 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 # timeout=10
Commit message: "Merge branch 'main' into fc_matching"
 > git rev-list --no-walk 0b06b433455ca4dd796e8cbfe55d0d4f4ba3f235 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins3193439717990562499.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 582 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 27%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 36%]
........................................................................ [ 48%]
....................................................................... [ 60%]
tests/unit/test_s3.py .. [ 61%]
tests/unit/test_tf_dataloader.py ............ [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
................................ [ 76%]
tests/unit/test_torch_dataloader.py ............................ [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
................................................................... [100%]

=============================== warnings summary ===============================
tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 15 warnings
tests/unit/test_workflow.py: 3 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44823 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_layers.py: 130 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dense_embedding_layer[stack]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5a6590>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c5a9950>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c5a9950>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c5ad110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c5ad110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5ad110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5df110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c636510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c636510>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fe76c5adb90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fe76c5adb90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:103: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fe76c5adb90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 36504 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 38520 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 39744 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 40212 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 40032 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 38880 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 77760 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 134 12 81 5 87% 27->28, 28, 51->60, 60, 68->49, 190-198, 201, 294->302, 315->318, 321-322, 325
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 11 0 4 0 100%
nvtabular/framework_utils/torch/models.py 24 0 8 1 97% 80->82
nvtabular/framework_utils/torch/utils.py 31 7 10 3 76% 51->52, 52, 55->56, 56-58, 61->67, 67-69
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 182->184
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 188 8 60 5 95% 69->70, 70, 133->134, 134, 144-145, 156, 231->233, 246->247, 247, 269->270, 270-271
nvtabular/loader/tensorflow.py 110 17 48 11 81% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307, 333->336, 336
nvtabular/loader/tf_utils.py 51 7 20 5 83% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 56->64, 59-64
nvtabular/loader/torch.py 48 10 10 0 72% 27-29, 32-38
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 384 59 206 41 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 280->281, 281, 284->290, 360->361, 361-363, 365->366, 366, 367->368, 368, 390->393, 393, 403->404, 404, 409->413, 413, 437->438, 438-439, 441->442, 442-443, 445->446, 446-462, 464->468, 468, 472->473, 473, 474->475, 475, 482->483, 483, 484->485, 485, 490->491, 491, 500->507, 507-508, 512->513, 513, 525->526, 526, 527->531, 531, 534->552, 552-555, 578->579, 579, 582->583, 583, 584->585, 585, 592->593, 593, 594->597, 597, 704->705, 705, 706->707, 707, 738->753, 776->777, 777, 793->798, 796->797, 797, 807->804, 812->804, 819->820, 820
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 80 3 30 3 95% 146->147, 147, 151->176, 183->184, 184, 208
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 41 6 10 2 80% 42-46, 68->69, 69-71, 88->89, 89
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 448 16 248 23 94% 105->109, 109, 115->116, 116-120, 150->exit, 166->exit, 182->exit, 198->exit, 251->253, 301->302, 302, 381->384, 384, 409->410, 410, 416->419, 419, 527->526, 577->582, 582, 585->586, 586, 629->630, 630, 698->686, 826->832, 832->exit, 874->875, 875, 884->890, 926->927, 927-929, 933->934, 934, 969->970, 970
setup.py 2 2 0 0 0% 18-20

TOTAL 3282 440 1370 169 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.49%
=========== 574 passed, 8 skipped, 212 warnings in 455.61s (0:07:35) ===========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6301827228761798190.sh

@benfred merged commit ca38bad into NVIDIA-Merlin:main Nov 3, 2020
@alecgunny mentioned this pull request Nov 3, 2020
# boundaries and embedding dim so that we can wrap
# with either indicator or embedding later
if key in [col.key for col in numeric_columns]:
    buckets[key] = (column.boundaries, embedding_dim)

I believe that here, and two lines below, it should be cat_column.boundaries. It fails to find the boundaries attribute when I try to use this utility, and it passes once I replace column.boundaries with cat_column.boundaries.
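
For context, here is a minimal, self-contained sketch of the fix this comment suggests. The "age" numeric column, its boundaries, and the embedding dimension are illustrative assumptions rather than the PR's actual code; the point is only that the bucket boundaries live on the wrapped bucketized column, not on the embedding/indicator wrapper.

import tensorflow as tf

# Illustrative feature columns (assumed names and values, not from the PR)
age = tf.feature_column.numeric_column("age")
age_buckets = tf.feature_column.bucketized_column(age, boundaries=[18.0, 35.0, 65.0])
embedded = tf.feature_column.embedding_column(age_buckets, dimension=8)

numeric_columns = [age]
buckets = {}

# The embedding column wraps the bucketized column, so the boundaries are
# reached through embedded.categorical_column, i.e. cat_column.boundaries.
cat_column = embedded.categorical_column
key = cat_column.source_column.key  # key of the underlying numeric column
if key in [col.key for col in numeric_columns]:
    buckets[key] = (cat_column.boundaries, embedded.dimension)

print(buckets)  # {'age': ((18.0, 35.0, 65.0), 8)}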

mikemckiernan pushed a commit that referenced this pull request Nov 24, 2022