febrl model match fails in docker #427

Closed
sonalgoyal opened this issue Aug 8, 2022 · 10 comments

@sonalgoyal
Member

Running match for febrl on the 0.3.4 release gives an error: z_sim18 not found. I suspect that the Python model configuration differs from the one in config.json, leading to this error. Need to investigate further.

@sonalgoyal sonalgoyal changed the title from "mismatcj in febrl model from config.json and python" to "febrl model match fails in docker" Aug 8, 2022
@sonalgoyal
Member Author

@Akash-R-7 can you please check this?

@Akash-R-7
Contributor

@sonalgoyal, the problem happens only on the docker image, not in the local repo. It gives the same error even with similar MATCHTYPE configurations in the Python file and config.json.

@UsAndRufus

Bump on this. I just tried the default test run as specified in the README and it doesn't work.

@sonalgoyal sonalgoyal assigned vikasgupta78 and unassigned Akash-R-7 Mar 20, 2023
@sonalgoyal sonalgoyal added this to the 0.3.5 milestone Mar 20, 2023
@vikasgupta78
Collaborator

I was able to run it using the following steps:

Go to the folder /zingg/docker/mac (which contains the Dockerfile)

docker image build -t zingg/vikas .

=> the docker image zingg/vikas gets built using the tar location specified in the Dockerfile
=> it can be seen in Docker Desktop

Now go to /tmp

docker run -v /tmp:/tmp -it zingg/vikas bash

./scripts/zingg.sh --phase findTrainingData --conf examples/febrl/config.json --zinggDir /tmp/z_docker

@vikasgupta78
Collaborator

Tried the following:
docker run -v /tmp:/tmp -it zingg/vikas bash
./scripts/zingg.sh --run examples/febrl/FebrlExample.py

The error did not occur (by default FebrlExample.py ran trainMatch).

@vikasgupta78
Collaborator

vikasgupta78 commented Mar 24, 2023

I also ran the phases one by one by modifying FebrlExample.py (after deleting models/100). Issue not reproduced.

@vikasgupta78
Collaborator

I tried a combination: findTrainingData, label and train using the JSON config, then match using FebrlExample.py. This was done after deleting models/100. Issue not reproduced.
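For anyone retracing these attempts, the JSON-driven phases can be listed as command lines with a small helper. This is only a sketch: `phase_commands` is a hypothetical name, and it uses only the `--phase`, `--conf` and `--zinggDir` flags that appear in this thread. It prints the commands rather than running them.

```python
import shlex

# Paths as used in the comments above, relative to the container's working directory.
ZINGG_SCRIPT = "./scripts/zingg.sh"
CONF = "examples/febrl/config.json"


def phase_commands(phases, conf=CONF, zingg_dir=None):
    """Return one argv list per phase, in the order given (hypothetical helper)."""
    commands = []
    for phase in phases:
        argv = [ZINGG_SCRIPT, "--phase", phase, "--conf", conf]
        if zingg_dir is not None:
            # --zinggDir is optional; it appears in the findTrainingData run above.
            argv += ["--zinggDir", zingg_dir]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Print the commands so they can be pasted into the container shell.
    for argv in phase_commands(["findTrainingData", "label", "train", "match"]):
        print(shlex.join(argv))
```

Running the last printed command (match via config.json) against the model shipped in the image is the case that differs from running match through FebrlExample.py.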

@vikasgupta78
Collaborator

Finally reproduced using the following:

docker run -v /tmp:/tmp -it zingg/vikas bash
./scripts/zingg.sh --phase match --conf examples/febrl/config.json

. Available: z_z_zid, z_zid, fname, lname, stNo, add1, add2, city, areacode, state, dob, ssn, z_source, z_fname, z_lname, z_stNo, z_add1, z_add2, z_city, z_areacode, z_state, z_dob, z_ssn, z_z_source, z_sim0, z_sim1, z_sim2, z_sim3, z_sim4, z_sim5, z_sim6, z_sim7, z_sim8, z_sim9, z_sim10, z_sim11, z_sim12, z_sim13, z_sim14, z_sim15, z_sim16, z_sim17

=> this issue doesn't occur if FebrlExample.py is run in match mode

=> this indicates an inconsistency between the Python and JSON configs of the example model shipped with docker: match works when run via Python but not via JSON, so the two configs could differ

=> an easy fix is to run trainMatch instead of match
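To make the error easier to read, the column list from the message above can be scanned for similarity columns. The sketch below only parses the "Available" list quoted in this comment; `similarity_indices` is a hypothetical helper, and the reading that the shipped model expects more similarity features than the JSON config produces is an inference, not something the log states.

```python
import re

# Column list copied from the "Available:" part of the error message above.
AVAILABLE = (
    "z_z_zid, z_zid, fname, lname, stNo, add1, add2, city, areacode, state, dob, ssn, "
    "z_source, z_fname, z_lname, z_stNo, z_add1, z_add2, z_city, z_areacode, z_state, "
    "z_dob, z_ssn, z_z_source, z_sim0, z_sim1, z_sim2, z_sim3, z_sim4, z_sim5, z_sim6, "
    "z_sim7, z_sim8, z_sim9, z_sim10, z_sim11, z_sim12, z_sim13, z_sim14, z_sim15, "
    "z_sim16, z_sim17"
)


def similarity_indices(columns):
    """Return the sorted indices N of all columns named z_sim<N>."""
    found = []
    for name in columns:
        match = re.fullmatch(r"z_sim(\d+)", name.strip())
        if match:
            found.append(int(match.group(1)))
    return sorted(found)


indices = similarity_indices(AVAILABLE.split(","))
print(len(indices), "similarity columns, highest index", max(indices))
print("z_sim18 present:", 18 in indices)
```

The data frame built from config.json has similarity columns z_sim0 through z_sim17, so a lookup of z_sim18 fails. That is consistent with the model having been trained under a different field configuration.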

@vikasgupta78
Collaborator

In FebrlExample.py all fields are fuzzy, while in config.json stNo and areacode are exact.
Changing these two to fuzzy made it work.
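A quick way to spot this kind of drift is to compare the match type per field on both sides. The following is a minimal sketch under stated assumptions: the JSON snippet is a trimmed, illustrative stand-in for config.json (the `fieldDefinition`, `fieldName` and `matchType` keys are assumed here, so check them against the real file), and `PYTHON_MATCH_TYPES` is a hand-written dict standing in for what FebrlExample.py declares, not something read from it.

```python
import json

# Illustrative excerpt only; key names are assumptions about the config layout.
CONFIG_JSON = """
{
  "fieldDefinition": [
    {"fieldName": "fname", "matchType": "fuzzy"},
    {"fieldName": "stNo", "matchType": "exact"},
    {"fieldName": "areacode", "matchType": "exact"}
  ]
}
"""

# Hand-written stand-in for the Python example, where every field is fuzzy.
PYTHON_MATCH_TYPES = {"fname": "fuzzy", "stNo": "fuzzy", "areacode": "fuzzy"}


def match_type_diff(config, python_types):
    """Map each field whose match type differs to a (json, python) pair."""
    json_types = {
        field["fieldName"]: field["matchType"].lower()
        for field in config["fieldDefinition"]
    }
    diff = {}
    for name in sorted(set(json_types) | set(python_types)):
        json_type = json_types.get(name)
        py_type = python_types.get(name)
        if py_type is not None:
            py_type = py_type.lower()
        if json_type != py_type:
            diff[name] = (json_type, py_type)
    return diff


diff = match_type_diff(json.loads(CONFIG_JSON), PYTHON_MATCH_TYPES)
for name, (json_type, py_type) in diff.items():
    print(f"{name}: config.json={json_type}, python={py_type}")
```

With the values above, only stNo and areacode are reported, matching the two fields named in this comment.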

vikasgupta78 added a commit to zinggAI/zingg-vikas that referenced this issue Mar 24, 2023
@vikasgupta78
Collaborator

Fixed in commit 687cef2, pull request #543.

Generated the model again and changed exact to fuzzy in the JSON where there was a difference.

sonalgoyal added a commit that referenced this issue Mar 25, 2023
issue #427 difference between python and json config

4 participants