Visartm docs #746

Merged
merged 3 commits into from
Jan 29, 2017
20 changes: 20 additions & 0 deletions docs/visartm/datasets.txt
@@ -0,0 +1,20 @@
Datasets
====================

Object model
-----------------------------------
A **dataset** is a collection of **documents** together with a vocabulary (a set of **terms**).

Technically, a document is a multiset or a list of terms. A document can also contain *raw text*.

Each term belongs to a **modality**. If no **modalities** are defined, all terms belong to the **modality** @default_class.

An **ArtmModel** is a topic model.
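
The object model above can be pictured with a few plain Python classes. This is only an illustrative sketch: the class and field names (``Term``, ``Document``, ``Dataset``, ``raw_text``) are hypothetical and are not taken from the VisARTM source code.

```python
from collections import Counter
from dataclasses import dataclass, field

# Modality used when none is specified, as described in the text above.
DEFAULT_MODALITY = "@default_class"


@dataclass(frozen=True)
class Term:
    """A vocabulary entry; frozen so it can be used as a Counter key."""
    text: str
    modality: str = DEFAULT_MODALITY


@dataclass
class Document:
    """A document as a multiset of terms, optionally with its raw text."""
    title: str
    terms: Counter = field(default_factory=Counter)  # Term -> occurrence count
    raw_text: str = ""


@dataclass
class Dataset:
    """A collection of documents; the vocabulary is derived from them."""
    name: str
    documents: list = field(default_factory=list)

    @property
    def vocabulary(self):
        # Set of all distinct terms that occur in any document.
        return {term for doc in self.documents for term in doc.terms}

    @property
    def modalities(self):
        # Set of all modalities that occur in the vocabulary.
        return {term.modality for term in self.vocabulary}


doc = Document(
    title="doc1",
    terms=Counter({Term("topic"): 2, Term("model"): 1, Term("Ola_Nordmann", "author"): 1}),
)
dataset = Dataset(name="demo", documents=[doc])
```

Here the two plain words fall into ``@default_class`` because no modality was given for them, while the author name is assigned to a separate ``author`` modality.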

Creating a dataset
-----------------------------------

Go to the folder **visartm/data/datasets** and create a subfolder named after your dataset. Put a single file named **vw.txt** there; it describes your dataset in Vowpal Wabbit format. This file is required, and it is all you need to proceed.

Then go to **Datasets**, click **Create new dataset**, choose the **Local** tab, select the folder you created in the **Folder** combo box, and click **Create**.
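
If your documents are already tokenized, a **vw.txt** file can be produced with a short script. The sketch below assumes the line layout that BigARTM documents for Vowpal Wabbit input (document title first, then ``|modality`` followed by ``token`` or ``token:count`` entries). The helper names ``to_vw_line`` and ``write_vw_file`` and the sample data are hypothetical, and tokens are assumed to contain no spaces, colons, or ``|`` characters.

```python
from collections import Counter


def to_vw_line(doc_id, tokens_by_modality):
    """Format one document as a single Vowpal Wabbit line."""
    parts = [doc_id]
    for modality, tokens in tokens_by_modality.items():
        parts.append("|" + modality)
        # Collapse repeated tokens into token:count entries.
        for token, count in sorted(Counter(tokens).items()):
            parts.append(token if count == 1 else f"{token}:{count}")
    return " ".join(parts)


def write_vw_file(path, documents):
    """Write all documents to `path`, one document per line."""
    with open(path, "w", encoding="utf-8") as out:
        for doc_id, tokens_by_modality in documents.items():
            out.write(to_vw_line(doc_id, tokens_by_modality) + "\n")


# Hypothetical sample data: two documents, one of them with an author modality.
documents = {
    "doc1": {"@default_class": ["topic", "model", "topic"], "author": ["Ola_Nordmann"]},
    "doc2": {"@default_class": ["dataset", "term"]},
}
lines = [to_vw_line(doc_id, tokens) for doc_id, tokens in documents.items()]
```

Calling ``write_vw_file`` with a path such as ``visartm/data/datasets/my_dataset/vw.txt`` (where ``my_dataset`` is a placeholder for your own folder name) would then create the file that the **Local** tab expects.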

8 changes: 6 additions & 2 deletions docs/visartm/index.txt
@@ -3,5 +3,9 @@ VisARTM

.. toctree::
   :maxdepth: 2

   intro
   installation
   datasets
   models
   visualizations

60 changes: 60 additions & 0 deletions docs/visartm/installation.txt
@@ -0,0 +1,60 @@
Installation
====================

You can deploy VisARTM locally on your computer. It runs on Windows, Linux, and macOS. Follow the steps below.

1. Make sure Python 3 is installed. We recommend using `Anaconda <https://www.continuum.io/downloads>`_.

2. Make sure you have installed `BigARTM <http://bigartm.readthedocs.io/en/master/installation/index.html>`_.

3. Install Django with the console command **pip install django**.

4. Install `PostgreSQL <https://www.postgresql.org/download/>`_ and `pgAdmin <https://www.pgadmin.org/>`_. Of course, you can use any database management system that Django supports, but we recommend PostgreSQL.

5. Open pgAdmin and create a database. Remember the username and password for this database. The default username in PostgreSQL is "postgres".

6. Make sure you have installed git.

7. Clone VisARTM with the console command:

.. code-block:: bash

    git clone https://github.com/bigartm/visartm.git

8. Now link the database created in step 5 with VisARTM. To do so, open the file **visartm/settings.py** and find the following lines:

.. code-block:: python

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'artmonlinedb',
            'USER': 'postgres',
            'PASSWORD': '******',
            'HOST': '127.0.0.1',
            'PORT': '5432',
        }
    }

Modify the values as follows. NAME is the name of the database, USER is the database user, and PASSWORD is that user's password. If you use a local PostgreSQL server, you do not need to change ENGINE; for any other database, set ENGINE to the matching Django backend. Leave HOST unchanged as long as the database server runs on the same machine. Note that this ENGINE requires the **psycopg2** package to be installed.

9. Next, let Django create the database tables. Go to the folder named **visartm** and run:

.. code-block:: bash

    python manage.py makemigrations
    python manage.py migrate

10. Create a superuser for the service. Django will ask you for a username and password. Remember them, because you will need them to log in to the service.

.. code-block:: bash

    python manage.py createsuperuser

11. Run the server.

.. code-block:: bash

    python manage.py runserver

12. Open your favorite web browser and navigate to http://127.0.0.1:8000
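
Step 8 stores the database credentials directly in **visartm/settings.py**. As a variation, the same ``DATABASES`` dictionary can be filled from environment variables, so that the password does not end up in the repository. The following is a minimal sketch, not part of VisARTM: the function ``build_databases`` and the ``VISARTM_DB_*`` variable names are hypothetical, and the defaults simply repeat the values shown in step 8.

```python
import os


def build_databases(env=os.environ):
    """Build the Django DATABASES setting from environment variables.

    The VISARTM_DB_* variable names are illustrative assumptions;
    any variable that is not set falls back to the value from step 8.
    """
    return {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': env.get('VISARTM_DB_NAME', 'artmonlinedb'),
            'USER': env.get('VISARTM_DB_USER', 'postgres'),
            'PASSWORD': env.get('VISARTM_DB_PASSWORD', ''),
            'HOST': env.get('VISARTM_DB_HOST', '127.0.0.1'),
            'PORT': env.get('VISARTM_DB_PORT', '5432'),
        }
    }


# In settings.py this assignment would replace the literal dictionary.
DATABASES = build_databases()
```

With this in place, you would set the variables in your shell before running the **manage.py** commands from steps 9 to 11.
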
1 change: 1 addition & 0 deletions docs/visartm/intro.txt
@@ -1,3 +1,4 @@
Introduction to VisARTM
=======================

This is an introduction.
30 changes: 30 additions & 0 deletions docs/visartm/models.txt
@@ -0,0 +1,30 @@
Model creation
====================

Automatic model creation
----------------------------------
For a quick start, use the built-in model builder.

Go to **Datasets**, select a dataset, and click **Create new model**. Select the **Flat** or **Hierarchical** tab, choose the number of topics and iterations, and click **Create**.

Creating model with script
-----------------------------------
Go to **Datasets**, select a dataset, and click **Create new model**. Select the **Empty** tab and click **Create**.

VisARTM will create a folder and show you the full path to it. There you will find a sample script named **sample.py**.
This script shows how to initialize an ARTM model with the dataset and where to put the result.

For a flat model, the result is two matrices, **theta** and **phi**. Save them to the model folder using the **to_pickle()** method
of **pandas.DataFrame**.

When the matrices are ready, press **Reload** on the model page.
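
Saving the two matrices could look like the sketch below. It makes several assumptions that **sample.py** should be checked against: the files are named exactly **phi** and **theta** without an extension, **phi** is oriented terms by topics and **theta** topics by documents (the orientation BigARTM returns), and the helper ``save_flat_model`` is hypothetical. Random matrices and a temporary folder stand in for a trained model and the real model folder.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def save_flat_model(model_dir, phi, theta):
    """Pickle phi and theta into the model folder (file names are assumed)."""
    phi.to_pickle(os.path.join(model_dir, "phi"))
    theta.to_pickle(os.path.join(model_dir, "theta"))


# Stand-in data: in practice these matrices come from a trained topic model.
terms = ["topic", "model", "dataset", "term"]
topics = ["topic_0", "topic_1"]
docs = ["doc1", "doc2", "doc3"]

rng = np.random.default_rng(0)

# phi: terms x topics, each column normalized to sum to 1.
phi = pd.DataFrame(rng.random((len(terms), len(topics))), index=terms, columns=topics)
phi = phi / phi.sum(axis=0)

# theta: topics x documents, each column normalized to sum to 1.
theta = pd.DataFrame(rng.random((len(topics), len(docs))), index=topics, columns=docs)
theta = theta / theta.sum(axis=0)

# A temporary folder stands in for the folder path that VisARTM shows you.
model_dir = tempfile.mkdtemp()
save_flat_model(model_dir, phi, theta)
```

With a BigARTM model object, the two data frames would typically come from its ``get_phi()`` and ``get_theta()`` methods instead of random numbers.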

Hierarchical models
-----------------------------------
If you are building an N-tier hierarchy using hARTM, you must also save the files **psi1**, **psi2**, ..., **psi(N-1)**. Save them with
**to_pickle()** as well, next to **theta** and **phi**.
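
Before pressing **Reload**, it can help to check that the model folder contains every file the hierarchy needs. The sketch below is a hypothetical helper, not part of VisARTM; it assumes the files are named exactly as in the text above (**theta**, **phi**, **psi1**, ...) with no file extension.

```python
import os


def expected_files(n_levels):
    """List the file names expected for a hierarchy with n_levels tiers."""
    if n_levels < 1:
        raise ValueError("n_levels must be at least 1")
    # A flat model (one tier) needs only theta and phi;
    # each additional tier adds one psi matrix.
    return ["theta", "phi"] + [f"psi{i}" for i in range(1, n_levels)]


def missing_files(model_dir, n_levels):
    """Return the expected files that are not present in model_dir."""
    return [
        name
        for name in expected_files(n_levels)
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

For a three-tier hierarchy, ``expected_files(3)`` lists **theta**, **phi**, **psi1**, and **psi2**, and an empty result from ``missing_files`` means the folder is complete under these naming assumptions.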

Multiple models
-----------------------------------
You can create more than one model for a dataset. Go to the dataset page and click **Create new model**. You can delete a single model from the model page, or delete all models with the **Clear** option on the dataset page. The list of all models is shown on the right side of the dataset page.

8 changes: 8 additions & 0 deletions docs/visartm/visualizations.txt
@@ -0,0 +1,8 @@
Visualizations
====================

You can explore documents, topics, terms, modalities, and the model itself.

You can also view different visualizations of the model as a whole.

If you have several models, choose the model to visualize with the radio buttons on the dataset page.