BDe score #20

Closed
sonujose123 opened this issue Dec 5, 2016 · 6 comments

Comments

@sonujose123

I created a BN from a BIF file (asia.bif), but when I try to compute its BDe score using 'lizards.csv', it fails.

Code -
from pyBN import *
import numpy as np
import os
from os.path import dirname

file = 'data/asia.bif'
bn = read_bn(file)
dpath = os.path.join(dirname(dirname(dirname(dirname(file)))),'data')
path = (os.path.join(dpath,'lizards.csv'))
data = np.loadtxt(path, dtype='int32',skiprows=1,delimiter=',')

print BDe(bn,data)

Error

Traceback (most recent call last):
  File "test.py", line 12, in <module>
    print BDe(bn,data)
  File "/home/sonu/Documents/pyBN/pyBN-master/pyBN/learning/structure/score/bayes_scores.py", line 61, in BDe
    counts_dict = mle_fast(bn, data, counts=True, np=True)
  File "/home/sonu/Documents/pyBN/pyBN-master/pyBN/learning/parameter/mle.py", line 41, in mle_fast
    F[n]['values'] = list(nmp.unique(data[:,i]))
IndexError: index 3 is out of bounds for axis 1 with size 3

@ncullen93
Owner

I think this is because the network and the data don't match. The asia network has these nodes: ['asia', 'smoke', 'tub', 'bronc', 'lung', 'either', 'dysp', 'xray'], whereas the lizards dataset only has 3 columns (nodes). The BDe score measures how well a given BN fits a COMPATIBLE dataset (i.e. the nodes of the BN match up with the columns of the dataset). :)
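
A quick way to catch this kind of mismatch before scoring is to compare the network's node list with the shape of the dataset. The sketch below is only illustrative: `check_compatible` is a hypothetical helper, not part of pyBN, the node list is copied from the comment above, and the 3-column array is a stand-in for the lizards data rather than the real file.

```python
import numpy as np

# Node names as listed above for the network read from asia.bif.
bn_nodes = ['asia', 'smoke', 'tub', 'bronc', 'lung', 'either', 'dysp', 'xray']


def check_compatible(nodes, data):
    """Hypothetical helper (not a pyBN function): raise ValueError if the
    data does not have exactly one column per network node."""
    data = np.asarray(data)
    if data.ndim != 2:
        raise ValueError("expected a 2-D array, got %d-D" % data.ndim)
    if data.shape[1] != len(nodes):
        raise ValueError(
            "network has %d nodes but data has %d columns"
            % (len(nodes), data.shape[1])
        )


# Stand-in for the lizards data: 3 columns, as described above.
lizards_like = np.zeros((10, 3), dtype='int32')

try:
    check_compatible(bn_nodes, lizards_like)
except ValueError as err:
    print(err)
```

Note that this only checks the column count. It cannot tell whether the columns are in the same order as the network's nodes, which a plain numpy array has no way to express.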

@ncullen93
Owner

Note: you can try LEARNING the structure from the lizards.csv dataset and checking its BDe score, OR you can generate your own random asia dataset and check the BDe score against that.

@sonujose123
Author

sonujose123 commented Dec 5, 2016 via email

@sonujose123
Author

Hi Nicholas,
I tried again with a modified CSV for asia.bif, but I am still getting an error.
This is the output:
python test.py
['asia', 'smoke', 'tub', 'bronc', 'lung', 'either', 'dysp', 'xray']
('asia', 'tub')
('smoke', 'bronc')
('smoke', 'lung')
('tub', 'either')
('bronc', 'dysp')
('lung', 'either')
('either', 'dysp')
('either', 'xray')
/home/sonu/Documents/pyBN/pyBN-master/pyBN/learning/parameter/mle.py:48: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
F[rv]['cpt'] = nmp.histogram(data[:,rv], bins=bn.card(rv))[0]
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    print BDe(bn,data)
  File "/home/sonu/Documents/pyBN/pyBN-master/pyBN/learning/structure/score/bayes_scores.py", line 61, in BDe
    counts_dict = mle_fast(bn, data, counts=True, np=True)
  File "/home/sonu/Documents/pyBN/pyBN-master/pyBN/learning/parameter/mle.py", line 48, in mle_fast
    F[rv]['cpt'] = nmp.histogram(data[:,rv], bins=bn.card(rv))[0]
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

Please help me.

Regards,
Sonu

Attachments: asia.bif.zip, lizards.csv.zip

@ncullen93
Owner

OK, I think I fixed it. pandas must have changed its indexing since I wrote this. It should work if 'data' is a pandas DataFrame whose columns are the same as the BN nodes, but I think it will now be broken if 'data' is a numpy array.
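
To illustrate the difference described here: a plain numpy array cannot be indexed by a node name, which produces the same kind of IndexError as in the traceback above, while a pandas DataFrame with node-named columns can. This is a minimal sketch that does not call pyBN. It assumes every variable is binary and coded 0/1, and the values are random placeholders, not samples from the asia network.

```python
import numpy as np
import pandas as pd

nodes = ['asia', 'smoke', 'tub', 'bronc', 'lung', 'either', 'dysp', 'xray']

# Placeholder data: 100 rows of random 0/1 values, one column per node.
# (Assumption: every variable is binary; values are NOT sampled from the BN.)
rng = np.random.RandomState(0)
arr = rng.randint(0, 2, size=(100, len(nodes))).astype('int32')

# Indexing a numpy array by node name fails with an IndexError.
try:
    arr[:, 'asia']
    numpy_ok = True
except IndexError:
    numpy_ok = False

# A DataFrame whose columns are the node names can be indexed by name.
df = pd.DataFrame(arr, columns=nodes)
asia_values = sorted(df['asia'].unique())

print(numpy_ok)
print(list(df.columns) == nodes)
```

With real data, something like `pd.read_csv(path)` would give a DataFrame directly, provided the CSV header row uses the same names as the BN nodes.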

@ncullen93
Owner

Pull the repository and try again
