This repository has been archived by the owner on Dec 12, 2021. It is now read-only.
Commit
1 parent 595375d, commit 8a6ab30
Showing 6 changed files with 38 additions and 11 deletions.
@@ -0,0 +1,36 @@
Create Model
------------

DDT incrementally builds a domain model as the user `annotates <http://domain-discovery-tool.readthedocs.io/en/latest/use.html#annotation>`_ the retrieved pages. The accuracy of the domain model is displayed in the top right corner. It indicates how well the model covers the domain and how annotations influence that coverage.

To export the domain model, click the **Model** button at the top. This opens a drop-down menu, as shown in the figure below:

.. image:: model_dropdown.png
   :width: 800px
   :align: center
   :height: 400px
   :alt: alternate text

Click **Create Model** to export the model. This should bring up a file explorer pop-up (make sure pop-ups are enabled in your browser), as shown below. Save the compressed model file. It contains the ACHE classifier model, the training data for the model, and the initial seed list required for crawling.

.. image:: model_download.png
   :width: 800px
   :align: center
   :height: 400px
   :alt: alternate text

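After saving the file, it can be useful to check what the export contains before passing it to a crawler. The following is a minimal sketch that assumes the compressed model file is a ZIP archive; the helper name ``list_model_archive`` and the file name ``domain_model.zip`` are hypothetical, and the member names in a real export may differ.

```python
import zipfile


def list_model_archive(path):
    """Return the sorted member names of a ZIP archive.

    Assumption: the exported model is a ZIP file. The source only
    says "compressed model file", so adjust this if the format differs.
    """
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())
```

Calling ``list_model_archive("domain_model.zip")`` would then return the archive's member names, which should include the classifier model, its training data, and the seed list.
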
Annotation
~~~~~~~~~~

Currently, pages can be annotated as Relevant, Irrelevant, or Neutral. The |tag_all| buttons apply the corresponding tag to all pages in the current view, and the |tag_one| buttons tag individual pages. Annotations are used to build the domain model.

.. |tag_all| image:: tag_all.png

.. |tag_one| image:: tag_one.png

Note:

* At least 10 relevant pages and 10 irrelevant pages should be annotated to build the model. More annotations give better coverage of the domain and therefore a better domain model.
* Keep the numbers of relevant and irrelevant annotations balanced for a better model.

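The two guidelines above can be expressed as a simple check. This is only an illustrative sketch, not part of DDT: the function name ``check_annotations`` and the ``max_ratio`` balance threshold are hypothetical, and the labels are assumed to be plain strings matching the tag names.

```python
from collections import Counter

MIN_PER_CLASS = 10  # minimum per class, as stated in the note above


def check_annotations(labels, max_ratio=2.0):
    """Return a list of warnings for a sequence of page labels.

    max_ratio is a hypothetical balance threshold, not a DDT setting.
    Neutral labels are counted but do not affect the checks.
    """
    counts = Counter(labels)
    relevant = counts.get("Relevant", 0)
    irrelevant = counts.get("Irrelevant", 0)

    warnings = []
    if relevant < MIN_PER_CLASS:
        warnings.append(
            f"only {relevant} relevant pages; at least {MIN_PER_CLASS} needed"
        )
    if irrelevant < MIN_PER_CLASS:
        warnings.append(
            f"only {irrelevant} irrelevant pages; at least {MIN_PER_CLASS} needed"
        )

    # Flag imbalance only when both classes have at least one page.
    smaller, larger = sorted((relevant, irrelevant))
    if smaller > 0 and larger / smaller > max_ratio:
        warnings.append("relevant and irrelevant annotations are unbalanced")
    return warnings
```

An empty list means both guidelines are met under these assumed thresholds; otherwise each string describes one guideline that the current annotations do not satisfy.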