The program consists of two modules, RDF_processor and model.
It also contains two demonstration scripts: build_model.py, which can be used to create a model from serialized hash arrays or RDF triples, and predict.py, which provides predictions for a stored model.
The model file contains a prebuilt model which can be used by the predict script.
Constructs and stores a set of <subjects> from ident_file with RDF:type == object, using the Redland Python bindings.
The format of the RDF file should be:
<subject> RDF:type <object>.
ident_file: Name of the RDF Turtle file to be parsed.
object: URI of the <object> to match while parsing.
Constructs and stores an array of <object> strings for FOAF:Name predicates, and a corresponding identifier array describing each <subject>'s presence in the stored set of subjects.
The format of the RDF file should be:
<subject> FOAF:Name <object>.
map_file: Name of the RDF Turtle file to be parsed and mapped.
balance: If True, balances the arrays by downsampling the more prevalent category.
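As a rough illustration of what balancing by downsampling could look like, the sketch below trims the more prevalent category to the size of the rarer one. The helper name balance_by_downsampling, its arguments, the fixed seed and the use of 0/1 identifiers are all assumptions made for this example, not the module's actual code.

```python
import random


def balance_by_downsampling(objects, identifiers, seed=0):
    """Return (objects, identifiers) with equal counts of 0 and 1 labels.

    Hypothetical helper: assumes identifiers are 0 or 1 and that both
    lists are aligned index by index.
    """
    rng = random.Random(seed)
    positives = [i for i, label in enumerate(identifiers) if label == 1]
    negatives = [i for i, label in enumerate(identifiers) if label == 0]

    # Keep every sample of the rarer category and an equally sized
    # random subset of the more prevalent one.
    keep = min(len(positives), len(negatives))
    chosen = sorted(rng.sample(positives, keep) + rng.sample(negatives, keep))

    return [objects[i] for i in chosen], [identifiers[i] for i in chosen]


names = ["Ada Lovelace", "Acme Ltd", "Alan Turing", "Grace Hopper", "Initech"]
labels = [1, 0, 1, 1, 0]
balanced_names, balanced_labels = balance_by_downsampling(names, labels)
```

Sorting the chosen indices keeps the surviving samples in their original order, so any later shuffling remains a separate step.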
Tokenises the array of object strings and hashes the tokens using mmh3 to create and store a scipy DOK (dictionary-of-keys) sparse matrix.
mapping_size: The range of the hashes, which lie within [-mapping_size, +mapping_size].
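The hashing step can be pictured with the minimal sketch below. To keep it dependency-free it swaps in zlib.crc32 for mmh3 and a plain dictionary keyed by (row, column) for the scipy DOK matrix; the function names, the regular-expression tokeniser and the way the hash is folded into [-mapping_size, +mapping_size] are assumptions for illustration only.

```python
import re
import zlib


def signed_hash(token, mapping_size):
    """Map a token to an integer in [-mapping_size, +mapping_size].

    zlib.crc32 is only a standard-library stand-in for mmh3 here.
    """
    raw = zlib.crc32(token.encode("utf-8")) & 0xFFFFFFFF
    return raw % (2 * mapping_size + 1) - mapping_size


def hash_strings(strings, mapping_size):
    """Build a dictionary-of-keys matrix: {(row, column): count}."""
    matrix = {}
    for row, text in enumerate(strings):
        for token in re.findall(r"\w+", text.lower()):
            # Shift the signed hash so it can be used as a column index.
            column = signed_hash(token, mapping_size) + mapping_size
            matrix[(row, column)] = matrix.get((row, column), 0) + 1
    return matrix


features = hash_strings(["Ada Lovelace", "Acme Ltd Ltd"], mapping_size=1000)
```

Only the non-zero cells are stored, which is the same idea a DOK sparse matrix relies on when most strings contain a handful of tokens out of a large hash range.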
Shuffles the subject, features and identifier arrays.
Returns the current feature array.
Returns the array of identifiers.
Returns the array of subject strings.
size: Number of features in the array to be modelled.
batch_size: The maximum size of each batch.
alpha: The learning rate for batch SGD.
C: The L2 regularization term.
Fits dataset X to target Y by minimizing the logistic cost function using mini-batch gradient descent with L2 regularization.
X: The array to be fitted, of shape (n_samples, n_features).
Y: The target array for X, of shape (n_samples,).
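A minimal sketch of mini-batch gradient descent on the L2-regularized logistic cost is given below, using dense numpy arrays for brevity. The function names, the epochs argument, the bias handling and the default values are assumptions for this example; the module's own fit routine may differ, for instance by working on sparse input.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_minibatch(X, Y, batch_size=2, alpha=0.5, C=0.01, epochs=200):
    """Return weights (bias last) fitted by mini-batch gradient descent."""
    n_samples, n_features = X.shape
    Xb = np.hstack([X, np.ones((n_samples, 1))])  # append a bias column
    weights = np.zeros(n_features + 1)

    for _ in range(epochs):
        for start in range(0, n_samples, batch_size):
            xb = Xb[start:start + batch_size]
            yb = Y[start:start + batch_size]
            error = sigmoid(xb.dot(weights)) - yb
            # Gradient of the mean logistic cost over the batch.
            gradient = xb.T.dot(error) / len(yb)
            # L2 penalty on the weights, leaving the bias unpenalised.
            gradient[:-1] += C * weights[:-1]
            weights -= alpha * gradient
    return weights


X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
Y = np.array([0.0, 0.0, 1.0, 1.0])
weights = fit_minibatch(X, Y)
probabilities = sigmoid(np.hstack([X, np.ones((4, 1))]).dot(weights))
```

Here alpha scales each update and C scales the penalty that pulls the weights towards zero, mirroring the roles of the parameters described above.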
Predicts the value of X using the fitted model.
X: Value to be predicted.
Returns the mean successful prediction rate for X against targets Y on the fitted model.
X: The array to be predicted.
Y: The targets to be compared against.
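The mean successful prediction rate can be read as the fraction of predictions that match their targets. The sketch below assumes the predictions have already been computed and thresholded to class labels; mean_success_rate is a hypothetical name, not the model's method.

```python
def mean_success_rate(predictions, targets):
    """Fraction of predictions that equal their targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    matches = sum(1 for p, t in zip(predictions, targets) if p == t)
    return matches / float(len(targets))


rate = mean_success_rate([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 match
```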
- Python 2.7
- numpy
- scipy
- mmh3
- Redland Python bindings
- cPickle