Fix examples/Discovery_LUCAS.ipynb #15

jayavanth · 2019-05-01T19:49:16Z

The example calls:

Cgnn.predict(data, graph=ugraph, nb_runs=16, train_epochs=1500, test_epochs=1000)

The CGNN predict function doesn't accept nb_runs, train_epochs and test_epochs anymore. It has to be called like this:

Cgnn = CGNN(nb_runs=16, train_epochs=1500, test_epochs=1000)
Cgnn.predict(data, graph=ugraph)

diviyank · 2019-05-07T12:20:20Z

Yes, I should fix the example! Thanks for the feedback!

gkericks · 2019-05-24T18:17:04Z

I'm not sure this is related, but I am looking for an explanation of how the NUM_LUCAS.csv file was generated and can't find it. Do you have that listed somewhere?

diviyank · 2019-05-24T18:20:56Z

Hi,
Actually, NUM_LUCAS.csv was generated using the cdt.generators.AcyclicGraphGenerator class, by feeding it a ground truth graph. But yes, it doesn't make much sense to call it LUCAS, since it doesn't have much to do with the true dataset except for the variable names and the graph structure. I should change that; I will add it in the next version.
Best.
Diviyan
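To illustrate the idea of sampling a numerical dataset from a ground truth graph, here is a minimal standard-library sketch. It is not the cdt generator and does not use its API: the four-node graph, the linear mechanisms with Gaussian noise, the weight range, and the helper names (topological_order, sample_from_dag) are all assumptions made for illustration. Each column is standardized after sampling.

```python
import random
import statistics

# Hypothetical toy ground-truth DAG (illustrative names only):
# each node maps to the list of its parents.
GRAPH = {
    "Smoking": [],
    "Genetics": [],
    "Lung_cancer": ["Smoking", "Genetics"],
    "Coughing": ["Lung_cancer"],
}


def topological_order(graph):
    """Return the nodes so that every parent comes before its children."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        for parent in graph[node]:
            visit(parent)
        seen.add(node)
        order.append(node)

    for node in graph:
        visit(node)
    return order


def sample_from_dag(graph, n_samples, seed=0):
    """Sample each node as a weighted sum of its parents plus Gaussian noise.

    Assumption: linear mechanisms with random weights; this is a stand-in
    for whatever mechanism a real generator would use.
    """
    rng = random.Random(seed)
    columns = {}
    for node in topological_order(graph):
        weights = {p: rng.uniform(0.5, 2.0) for p in graph[node]}
        values = []
        for i in range(n_samples):
            signal = sum(w * columns[p][i] for p, w in weights.items())
            values.append(signal + rng.gauss(0.0, 1.0))
        # Standardize the column to zero mean and unit variance.
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        columns[node] = [(v - mean) / std for v in values]
    return columns


data = sample_from_dag(GRAPH, n_samples=500)
```

The resulting data shares only the variable names and the graph structure with the toy graph, which mirrors the relationship between NUM_LUCAS.csv and the true LUCAS dataset described above.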

gkericks · 2019-05-24T19:17:54Z

@Diviyan-Kalainathan Thanks for the quick reply!

Okay, so from that I see now that the example is about recreating the answer graph just using examples sampled from it. The original LUCAS data is all binary, whereas this new dataset assumes Gaussians at every node (the sampled data looks standardized). That being said, what constraints on the data input are there for effectively using your library?

I have a causal problem I am trying to solve and, like most real-world data, the input is of mixed types: some numerical, some categorical. Would you still recommend your library for exploring the dependencies or should I be looking for a different technique? I apologize in advance if that is already covered in your README and I just missed it.

diviyank · 2019-05-27T08:17:24Z

Hi,
There are no constraints on the data input for the library as a whole. Instead, it depends on the individual algorithms in the package. For example, SAM and CGNN accept only numerical data, whereas PC can accept categorical data. For mixed types, I don't know of an algorithm or statistical test that is quite efficient; I think your best bet would be to discretize your data and use an algorithm/test for categorical data (PC/GES).

Best regards,
Diviyan
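As a rough illustration of the discretization step suggested above, here is a minimal standard-library sketch that bins numerical columns by empirical quantiles and leaves categorical columns untouched. The column names, the choice of three bins, and the helper quantile_bins are hypothetical; this is one possible preprocessing step, not something provided by the library.

```python
def quantile_bins(values, n_bins=3):
    """Map each numeric value to a bin index in [0, n_bins - 1].

    Cut points are taken at the empirical quantiles of the column.
    """
    ranked = sorted(values)
    cuts = [ranked[(len(ranked) * k) // n_bins] for k in range(1, n_bins)]
    # The bin index is the number of cut points the value reaches.
    return [sum(v >= c for c in cuts) for v in values]


# Hypothetical mixed-type data: one numerical and one categorical column.
rows = {
    "age": [23, 35, 41, 58, 62, 19],
    "smoker": ["no", "yes", "no", "yes", "yes", "no"],
}

discrete = {
    name: quantile_bins(col)
    if all(isinstance(v, (int, float)) for v in col)
    else col
    for name, col in rows.items()
}
```

After this step every column is categorical, so the data can be passed to an algorithm or test designed for categorical data. The number of bins is a trade-off: fewer bins lose more information, while more bins leave fewer samples per category.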

diviyank · 2019-06-11T15:59:00Z

It should be fixed now. Sorry for the delay, but we really wanted to fix all the issues on dataset management before fixing this issue.
Please keep me updated.
Best,
Diviyan

diviyank · 2019-07-08T13:59:28Z

I will be closing this issue, as it should be solved. Don't hesitate to reopen it if the bug still persists in the latest version.
Best,
Diviyan

diviyank added a commit that referenced this issue Jun 10, 2019

[EXP] Fix (#15)

d1f2bc8

diviyank added a commit that referenced this issue Jun 11, 2019

[EXP] Ran Example + Fix (#15)

b083c40

diviyank closed this as completed Jul 8, 2019
