Synthpop freezes on dataset with factor attributes #20
Comments
Dear Daniel,
I'm afraid the code you sent appears as an image that is unreadable. The same was true on GitHub. Without seeing the details I can't tell what is going on, but I'm sure it is NOT because the factors have more than two levels. Those are handled OK, though computational problems can happen with lots of levels (e.g. >15).
Best, Gillian
Gillian M Raab
Emeritus Professor, Edinburgh Napier University
Part-time Research Fellow
Administrative Data Research Centre - Scotland
Edinburgh
+44 7748 678 551
…________________________________
From: danielamartinezd02 ***@***.***>
Sent: 22 March 2022 10:08
To: bnowok/synthpop ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [bnowok/synthpop] Synthpop freeze on dataset with factor attributes (Issue #20)
This email was sent to you by someone outside the University.
You should only click on links or attachments if you are certain that the email is genuine and the content is safe.
The synthesis process freezes for a dataset with factor attributes when there are more than 2 classes.
[image]<https://user-images.githubusercontent.com/58200257/159455975-6a72d404-19bb-472c-adb3-4baf62066bdc.png>
My R version is 4.1.2 and the synthpop version is 1.7.0.
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
|
Dear Gillian, There is one attribute with lots of levels ('Country') where it gets stuck. However, I also deleted it, and it is not the only one: I am also having problems with attributes that have 3 levels. I attach the dataset I am trying to synthesize. |
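For anyone following along, a quick way to see which factor columns might be the expensive ones is to count levels per column. This is only a sketch in base R: the toy data frame and its column names are made up for illustration (it is not the attached CSV), and the cut-off of 15 is just the rough figure mentioned earlier in this thread.

```r
# Hypothetical toy data standing in for the real dataset.
df <- data.frame(
  Country = factor(rep(paste0("C", 1:40), length.out = 200)),
  Gender  = factor(rep(c("F", "M", "Other"), length.out = 200)),
  Age     = rep(18:67, length.out = 200)
)

# Number of levels for each factor column (NA for non-factor columns).
n_levels <- sapply(df, function(col) {
  if (is.factor(col)) nlevels(col) else NA_integer_
})
print(n_levels)

# Factor columns above the rough ">15 levels" figure from this thread.
many_levels <- names(n_levels)[!is.na(n_levels) & n_levels > 15]
print(many_levels)
```

Columns listed in `many_levels` are the first candidates to move later in the synthesis order or to drop as predictors.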
I can have a look at some point later, but I'm pretty busy just now. Meanwhile, things you should try are:
1. Changing the order of synthesis
2. Restricting the predictor matrix to exclude variables with many levels
3. Using the method nested, if you can define wider groups from your many-level variables.
This paper https://arxiv.org/pdf/1712.04078.pdf may give you some hints, although it is pretty old now.
Best, Gillian
arXiv:1712.04078v1 [stat.AP] 12 Dec 2017<https://arxiv.org/pdf/1712.04078.pdf>
The staff member producing synthetic data can control the synthesis process in various ways, where the three main parameters are 1. Synthesis method(s) A different method can be specified for each variable.
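The first two suggestions above could look roughly like the sketch below. It is not taken from the maintainers: the toy data and column names are hypothetical, and it assumes the `visit.sequence`, `predictor.matrix` and `m = 0` arguments of `syn()` behave as described in the synthpop documentation. The synthpop calls are guarded, so the block only runs them if the package is installed.

```r
# Hypothetical toy data; 'Country' plays the role of the many-level factor.
df <- data.frame(
  Country = factor(rep(paste0("C", 1:40), length.out = 200)),
  Gender  = factor(rep(c("F", "M", "Other"), length.out = 200)),
  Age     = rep(18:67, length.out = 200)
)

# Suggestion 1: synthesise variables with few levels first and the
# many-level factor last (non-factors are treated as 0 levels here).
n_lev <- sapply(df, function(col) if (is.factor(col)) nlevels(col) else 0L)
vis <- names(sort(n_lev))
print(vis)

if (requireNamespace("synthpop", quietly = TRUE)) {
  # m = 0 sets up the default predictor matrix without synthesising.
  ini <- synthpop::syn(df, visit.sequence = vis, m = 0, print.flag = FALSE)
  pred <- ini$predictor.matrix

  # Suggestion 2: stop 'Country' being used as a predictor of anything.
  # Redundant here because it is visited last, but it matters if a
  # many-level factor has to appear earlier in the sequence.
  pred[, "Country"] <- 0

  syn_out <- synthpop::syn(df, visit.sequence = vis,
                           predictor.matrix = pred,
                           seed = 1, print.flag = FALSE)
  print(head(syn_out$syn))
}
```

For the third suggestion, my understanding is that the method for the detailed variable is given as `nested.` followed by the name of the wider grouping variable, which must be synthesised first; please check `?syn.nested` before relying on that.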
…________________________________
From: danielamartinezd02 ***@***.***>
Sent: 23 March 2022 07:58
To: bnowok/synthpop ***@***.***>
Cc: RAAB Gillian ***@***.***>; Comment ***@***.***>
Subject: Re: [bnowok/synthpop] Synthpop freezes on dataset with factor attributes (Issue #20)
Dear Gillian,
There is one attribute with lots of levels ('Country') where it gets stuck. However, I also deleted it, and it is not the only one: I am also having problems with attributes that have 3 levels. I attach the dataset I am trying to synthesize.
mental_health_train_data_all.csv<https://github.com/bnowok/synthpop/files/8330951/mental_health_train_data_all.csv>
Best,
Daniela
|
Dear Gillian, Thanks for your response. I will take a look at what you have suggested. Best, |
Hey @danielamartinezd02, were you able to solve this problem? I've been having the exact same problem with the UCI Adult census dataset. |
Did you look at the paper I suggested in my reply on GitHub?
If so, and it did not help, then perhaps send me more details of your problem.
Gillian
Gillian M Raab
Research Fellow (part-time)
Scottish Centre for Administrative Data Research
My core working days are Tuesdays and Thursdays
Though I sometimes swap them for other days
07748 678 551
…________________________________
From: Roham Koohestani ***@***.***>
Sent: 30 November 2023 10:15
To: bnowok/synthpop ***@***.***>
Cc: Gillian Raab ***@***.***>; Comment ***@***.***>
Subject: Re: [bnowok/synthpop] Synthpop freezes on dataset with factor attributes (Issue #20)
Hey @danielamartinezd02<https://github.com/danielamartinezd02>, were you able to solve this problem? I've been having the exact same problem with the UCI Adult census dataset.
|
The synthesis process never finishes; it just freezes for a dataset with factor attributes when there are more than 2 classes.
My R version is 4.1.2 and the synthpop version is 1.7.0.