GitHub - nik0spapp/wmil: Weighted multiple-instance learning algorithm

wmil

This repository contains a Python implementation of the weighted multiple-instance learning algorithm for aspect-based sentiment analysis proposed in the paper listed below. The features extracted from seven datasets are also provided for research purposes.

@InProceedings{pappas14,
  author    = {Pappas, Nikolaos  and  Popescu-Belis, Andrei},
  title     = {Explaining the Stars: Weighted Multiple-Instance Learning for Aspect-Based Sentiment Analysis},
  booktitle = {Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  month     = {October},
  year      = {2014},
  address   = {Doha, Qatar},
  publisher = {Association for Computational Linguistics},
  pages     = {455--466},
  url       = {http://www.aclweb.org/anthology/D14-1052}
}

A more scalable version of the above algorithm based on stochastic gradient descent can be found here: wmil-sgd.

Installing dependencies

The code requires the Python programming language and the pip package manager. For detailed instructions on installing both, please refer to the following links: http://www.python.org/getit/ and http://www.pip-installer.org/en/latest/.

Next, you should be able to install the following packages:

$ pip install numpy 
$ pip install scikit-learn
$ pip install scipy

Training and testing the model

The code extends the BaseEstimator class from the scikit-learn package, so you should be able to use it like any other sklearn estimator (see http://scikit-learn.org/stable/ for more details). For example:

$ python
>>> import pickle
>>> from wmil import APWeights
>>> from sklearn.metrics import mean_absolute_error
>>> data = pickle.load(open('features/ted_comments.p', 'rb'))  # pickle files should be opened in binary mode
>>> size = len(data['X'])
>>> k = int(size*0.5)
>>> x_train = data['X'][:k]
>>> y_train = data['Y'][:k]
>>> x_test = data['X'][k:]
>>> y_test = data['Y'][k:]
>>> model = APWeights(20, e1=1.0, e2=1.0, e3=1.0)
>>> model.fit(x_train, y_train)
[+] Training...
--/start
iteration 0 -> (MAE: 0.103437)
iteration 1 -> (MAE: 0.089629)
iteration 2 -> (MAE: 0.087793)
iteration 3 -> (MAE: 0.087565)
iteration 4 -> (MAE: 0.087523)
iteration 5 -> (MAE: 0.087515)
iteration 6 -> (MAE: 0.087510)
iteration 7 -> (MAE: 0.087511)
--/end
>>> mean_absolute_error(y_train, model.predict(x_train))  # training error
0.096217463769192518
>>> mean_absolute_error(y_test, model.predict(x_test))  # testing error
0.16325402985689552
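
The session above follows a simple hold-out protocol: the first half of the data is used for training and the remainder for testing. As a minimal sketch, the same steps can be wrapped in a helper that accepts any object exposing fit and predict. The names holdout_mae and MeanBaseline are hypothetical and are not part of wmil. MeanBaseline is only a stand-in so that the snippet runs without the feature files, and the toy data is invented for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / float(len(y_true))


class MeanBaseline(object):
    """Hypothetical stand-in model: always predicts the mean training label."""

    def fit(self, X, y):
        self.mean_ = sum(y) / float(len(y))
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def holdout_mae(model, X, Y, ratio=0.5):
    """Train on the first `ratio` share of the data, test on the remainder.

    Returns (training MAE, testing MAE), mirroring the session above.
    """
    k = int(len(X) * ratio)
    x_train, y_train = X[:k], Y[:k]
    x_test, y_test = X[k:], Y[k:]
    model.fit(x_train, y_train)
    train_err = mean_absolute_error(y_train, model.predict(x_train))
    test_err = mean_absolute_error(y_test, model.predict(x_test))
    return train_err, test_err


if __name__ == "__main__":
    # Toy data (assumption): four bags of instance vectors with one label each.
    X = [[[0.1, 0.2]], [[0.3, 0.4], [0.5, 0.6]], [[0.7, 0.8]], [[0.9, 1.0]]]
    Y = [0.2, 0.4, 0.6, 0.8]
    print(holdout_mae(MeanBaseline(), X, Y))
```

With the real code, calling holdout_mae(APWeights(20, e1=1.0, e2=1.0, e3=1.0), data['X'], data['Y']) should reproduce the steps of the session above, and the local mean_absolute_error can be swapped for the one in sklearn.metrics.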

Contact:

npappas@idiap.ch

