Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hybrid model item_metadata with nulls #47

Closed
ahuds001 opened this issue May 4, 2022 · 4 comments · Fixed by #52
Closed

Hybrid model item_metadata with nulls #47

ahuds001 opened this issue May 4, 2022 · 4 comments · Fixed by #52
Labels
bug Something isn't working enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed invalid This doesn't seem right

Comments

@ahuds001
Copy link
Contributor

ahuds001 commented May 4, 2022

Hi Collie team,

Love the work y'all are doing! I noticed when playing with the hybrid model that if I pass item_metadata that contains nulls the model can be trained and will build through all the stages. It generates metric scores as well but will only generate null predictions. This doesn't seem ideal, and it may be better if one of these things happened:

  • model fails altogether with an error explaining that item_metadata columns cannot contain null values
  • model gives a warning and proceeds with some form of imputation on the columns with null values

I think the first option is a better option personally, but I'm sure y'all would know better than I would.

@ahuds001 ahuds001 added the enhancement New feature or request label May 4, 2022
@nathancooperjones
Copy link
Collaborator

Hi @ahuds001, sorry for the delayed response here.

I totally agree that this seems like a bug, and that the first solution you listed above (raising an error and having the user deal with imputation on their end) seems like the most elegant solution to me.

If you're up for it, it'd be fantastic if you could contribute this into Collie. If not, we can keep this issue open and it'll be something I try to work on when I get some free time sometime soon. I don't think it should be too much lift to add this in - I imagine it'll be a simple check in BasePipeline if the item_metadata argument exists.

Cheers!

@nathancooperjones nathancooperjones added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed invalid This doesn't seem right labels Jun 8, 2022
@ahuds001
Copy link
Contributor Author

Hi @nathancooperjones sorry for the late reply back, I am currently on pat leave so things have been a bit crazy. I have some time here and there now that the kiddo is sleeping better so I'll try to get to this over the next two weeks. If I can't figure it out or don't have time for some reason I'll let you know.

@ahuds001
Copy link
Contributor Author

ahuds001 commented Jun 29, 2022

Hi @nathancooperjones I found an issue with the Dockerfile caused by nvidia changing keys. I am not sure what I did is the best way to fix it but it works. Let me know if you'd like me to add it to the PR, open up a separate issue, or do something else.

# added these to deal with nvidia issues
# GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease:
# The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
# solution from here: https://github.com/NVIDIA/nvidia-container-toolkit/issues/258
RUN rm /etc/apt/sources.list.d/cuda.list
RUN rm /etc/apt/sources.list.d/nvidia-ml.list
RUN apt-key del 7fa2af80
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub

@ahuds001 ahuds001 mentioned this issue Jun 30, 2022
8 tasks
@nathancooperjones
Copy link
Collaborator

Hi @nathancooperjones I found an issue with the Dockerfile caused by nvidia changing keys. I am not sure what I did is the best way to fix it but it works. Let me know if you'd like me to add it to the PR, open up a separate issue, or do something else.

# added these to deal with nvidia issues
# GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease:
# The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
# solution from here: https://github.com/NVIDIA/nvidia-container-toolkit/issues/258
RUN rm /etc/apt/sources.list.d/cuda.list
RUN rm /etc/apt/sources.list.d/nvidia-ml.list
RUN apt-key del 7fa2af80
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub

Ideally, this Dockerfile issue should be fixed in the #53 PR. If you're able to pull down these changes and re-run the run command, that should fix the issue (hopefully)!

@nathancooperjones nathancooperjones linked a pull request Jul 13, 2022 that will close this issue
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed invalid This doesn't seem right
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants