Moss project #127

Merged
merged 66 commits into from Jan 16, 2017
7cd70a4
Simplifies rule lookup for simple stores.
zorkow Sep 10, 2016
c5910da
Comments out API consistency tests.
zorkow Sep 10, 2016
f3239fe
Merge branch 'moss_project' into moss_lookup_simple_rules
zorkow Sep 10, 2016
8848ce8
Merge pull request #97 from zorkow/moss_lookup_simple_rules
zorkow Sep 10, 2016
e29d753
Cleanup after WP 1.1 + 1.2.
zorkow Sep 20, 2016
aaa49f5
Development documentation for MOSS project.
zorkow Sep 23, 2016
601aac8
Merge branch 'develop' into moss_project
zorkow Sep 25, 2016
a92af82
Some constraint refactoring.
zorkow Oct 6, 2016
3f701d7
Adds data structure of dynamic properties.
zorkow Oct 6, 2016
79591e7
Introduces explicit fallback values for comparators.
zorkow Oct 7, 2016
3dc9169
Dynamic constraint comparison with fallbacks and preference ordering.
zorkow Oct 8, 2016
3e0ddab
Merge branch 'master' into moss_project
zorkow Oct 15, 2016
1124a26
Merge branch 'develop' into moss_project
zorkow Oct 15, 2016
a1335c8
Basic Trie data structure.
zorkow Nov 2, 2016
6afec93
Basic integration of Trie.
zorkow Nov 2, 2016
845886d
Filling the trie.
zorkow Nov 2, 2016
ed787ab
Cleanup and renaming.
zorkow Nov 2, 2016
2f2cd59
Code cleanup.
zorkow Nov 2, 2016
0c15d47
Additions to the project doc.
zorkow Nov 2, 2016
c13d233
Rule retrival and initial integration into engine.
zorkow Nov 3, 2016
a817fb2
Introduces a single indexing trie in the combined store. Makes intial…
zorkow Nov 5, 2016
a3de592
Some code cleanup.
zorkow Nov 7, 2016
1df30c6
Turns rule collection from recursive to iterative.
zorkow Nov 7, 2016
0c079ee
Some refactoring and cleanup.
zorkow Nov 8, 2016
e783a89
Introduces some efficient functions for common Xpath expressions.
zorkow Nov 8, 2016
60a0f88
Tweaks rules for efficiency.
zorkow Nov 10, 2016
c44e053
Simplifies symbol lookup.
zorkow Nov 12, 2016
a78151a
Lints codebase.
zorkow Nov 12, 2016
bf2c0c2
Merge branch 'develop' into moss_project
zorkow Nov 28, 2016
47a9b52
Some renaming in dynamic trie.
zorkow Nov 2, 2016
1bff148
Initial introduction of a grammar structure.
zorkow Dec 6, 2016
05135e8
Moves font corrections to grammar object.
zorkow Dec 6, 2016
f48ec58
Add dev documetation.
zorkow Dec 6, 2016
f64bc7b
First fully working grammar context handling. Works for simple determ…
zorkow Dec 8, 2016
c701ceb
Repeated font handling via grammar structure.
zorkow Dec 9, 2016
98cfaf4
Adds preprocessing facility for grammar corrections.
zorkow Dec 9, 2016
72069cc
Purges grammar module of unnecessary methods.
zorkow Dec 10, 2016
d6eda47
General cleanup and linting of grammar and speech rule engine module.
zorkow Dec 10, 2016
7ed6d76
Merge branch 'moss_project' into trie
zorkow Jan 6, 2017
9ec214b
Reverses dependency order for dynamic constraints. Dynamic Constraint…
zorkow Jan 8, 2017
1f5e65c
Ensures default generation of global dynamic constraint and fallback …
zorkow Jan 8, 2017
e3f5ce4
Merge pull request #123 from zorkow/trie
zorkow Jan 8, 2017
5d1d959
Merge branch 'constraint_system' into test_merge
zorkow Jan 8, 2017
d1c8fa4
Merge branch 'grammatic_structure' into test_merge
zorkow Jan 8, 2017
280811a
Clean up of comments and linting the code base.
zorkow Jan 8, 2017
53b341f
Refactors speech rule personality attributes into a separate dictiona…
zorkow Jan 8, 2017
e39d5a3
Explicit grammar representation in speech rules.
zorkow Jan 9, 2017
b6bec08
Completes refactoring of grammar and attribute dictionaries in speech…
zorkow Jan 9, 2017
eaa8665
Adds specialised trie tests for grammar lookup and attribute inequality.
zorkow Jan 9, 2017
5a35911
Adjusts type handling and fixes some tests.
zorkow Jan 9, 2017
3c93b0d
Minor bug fixes and unit tests for attributes and grammars in speech …
zorkow Jan 10, 2017
aa18fb0
Incorporates review suggestions.
zorkow Jan 10, 2017
5a0ff60
Merge pull request #126 from zorkow/refactor_rule_attributes
zorkow Jan 10, 2017
76efd66
Adds omitted tests.
zorkow Jan 10, 2017
c38aba9
Fix some mathml store rules and add tests in preparation of testing c…
zorkow Jan 14, 2017
5bcf43e
Introduces efficient trie test to deal with namespaces.
zorkow Jan 14, 2017
911a314
Moves preprocessing into Grammar module.
zorkow Jan 14, 2017
3d6d564
Moves preprocessing into a grammar attribute.
zorkow Jan 14, 2017
768f187
Treatment of mathspeak digits in math maps.
zorkow Jan 15, 2017
f0f8d46
Updates mathmaps file for IE.
zorkow Jan 15, 2017
05b761a
Performs preprocessing immediately during auditory description creation.
zorkow Jan 15, 2017
cf084e0
Some cleanup and rename 'preprocess' to 'translate'.
zorkow Jan 15, 2017
55cedb6
Removes need to parse the grammar state.
zorkow Jan 15, 2017
1c09e4c
Some code cleanup.
zorkow Jan 15, 2017
4577de0
Lints code base.
zorkow Jan 16, 2017
a23e2de
Merge pull request #128 from zorkow/refactor_symbol_lookup
zorkow Jan 16, 2017
285 changes: 285 additions & 0 deletions doc/moss-project.org
@@ -0,0 +1,285 @@
# Musings/ideas/design decisions on the MOSS project

* Constraint redesign


** Purpose: to improve on the applicability tests

** Three types of constraints: Dynamic, Query, Boolean

** Changes for Dynamic Constraints:

*** Remain basically as is. Make testing more efficient.

*** Can have a flexible number of axes.

**** However that number is fixed by the rule store (via the parser).

**** Otherwise similar to what we had

*** Comparison of Dynamic Constraints

**** DONE We need a parser to parse constraints from strings. This is specific to the rule store.

**** TODO Value list for

**** DONE Equality of constraints. This is a method of constraints.

**** TODO Have a priority order to compare against. This can be done globally, i.e., each

***** Parser order can be different from priority order!

***** Reference constraint is the global dynamic constraint

Example: mathspeak.brief < mathspeak.default if

***** Compare method for ordering with respect to the global constraint.
What flexibility do we allow for ordering? Is a simple ordered list of axes/attributes enough,
or do we need an ordered list per axis?

***** Match method to determine if we have a rule that is to be considered.
Allow sets of axis values.
Example: mathspeak, [default, brief]

***** DO WE STILL WANT THE DEFAULT AS BOTTOM LINE?
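
The parse, equality, match and compare steps above can be sketched roughly as follows. This is only an illustration under assumptions: the two axes =domain= and =style=, the dot-separated string format, and all function names (=parseConstraint=, =equalConstraints=, =matches=, =makeComparator=) are hypothetical and not taken from the engine.

```javascript
// Hypothetical sketch; axis names and function names are assumptions.
const AXES = ['domain', 'style'];  // parser order, fixed by the rule store

// Parses 'mathspeak.brief' into {domain: 'mathspeak', style: 'brief'}.
function parseConstraint(str, axes = AXES) {
  const values = str.split('.');
  if (values.length !== axes.length) {
    throw new Error('Expected ' + axes.length + ' axis values in: ' + str);
  }
  const constraint = {};
  axes.forEach((axis, i) => { constraint[axis] = values[i]; });
  return constraint;
}

// Equality: all axes carry the same value.
function equalConstraints(a, b, axes = AXES) {
  return axes.every((axis) => a[axis] === b[axis]);
}

// Match: every axis value must be a member of the allowed set for that axis.
function matches(constraint, allowed, axes = AXES) {
  return axes.every((axis) => allowed[axis].includes(constraint[axis]));
}

// Compare: 'priority' lists the axes in priority order (which may differ from
// parser order); 'preference' lists the values per axis, most preferred first.
function makeComparator(priority, preference) {
  const rank = (axis, value) => {
    const index = preference[axis].indexOf(value);
    return index === -1 ? preference[axis].length : index;
  };
  return (a, b) => {
    for (const axis of priority) {
      const diff = rank(axis, a[axis]) - rank(axis, b[axis]);
      if (diff !== 0) return diff;
    }
    return 0;
  };
}

// Usage: keep only matching constraints, most preferred first.
const allowed = {domain: ['mathspeak'], style: ['brief', 'default']};
const compare = makeComparator(['style', 'domain'], allowed);
const candidates = ['mathspeak.default', 'mathspeak.sbrief', 'mathspeak.brief']
    .map((str) => parseConstraint(str))
    .filter((constraint) => matches(constraint, allowed))
    .sort(compare);
console.log(candidates.map((c) => c.domain + '.' + c.style));
// prints [ 'mathspeak.brief', 'mathspeak.default' ]
```

Keeping the priority order as a separate argument is one way to let it differ from the parser order, as noted above.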


** Changes for Static constraints:

*** Query vs boolean constraints

*** Queries should be simplified when possible, e.g., self::NAME

**** Mark up as different forms

***** If query is of the form self::TAGNAME
Mark as 'tag' and test with node.tagName
Take care of self::*!
Take care of namespaces (e.g., self::mathml:math)

***** If query is of the form @attr
Mark as 'attr' and test with hasAttribute.

***** If query is of the form @attr="something"
Mark as 'attrEQ' and test with hasAttribute & getAttribute=

***** If query is of the form @attr!="something"
Mark as 'attrNEQ' and test with !hasAttribute || getAttribute!=

***** Test speed of the above against XPATH first!
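
A rough sketch of how the forms above could be recognised and turned into direct DOM tests. All names (=FORMS=, =classifyQuery=, =makeTest=) and the namespace map are hypothetical. The =attrNEQ= test follows the reading in these notes, where a missing attribute counts as "not equal"; plain XPath 1.0 evaluates =@attr!="x"= to false when the attribute is absent, so the two are not strictly equivalent.

```javascript
// Hypothetical sketch; names are illustrative, not the engine's API.
const NAME = '[\\w-]+';
const FORMS = [
  // self::tag or self::prefix:tag; the wildcard self::* deliberately fails.
  {form: 'tag', regexp: new RegExp(`^self::(?:(${NAME}):)?(${NAME})$`)},
  {form: 'attrEQ', regexp: new RegExp(`^@(${NAME})\\s*=\\s*"([^"]*)"$`)},
  {form: 'attrNEQ', regexp: new RegExp(`^@(${NAME})\\s*!=\\s*"([^"]*)"$`)},
  {form: 'attr', regexp: new RegExp(`^@(${NAME})$`)},
];

// Returns the recognised form plus a test function, or form 'xpath' with no
// test, meaning the query still needs full XPath evaluation.
function classifyQuery(query, namespaces = {}) {
  for (const {form, regexp} of FORMS) {
    const match = query.trim().match(regexp);
    if (match) return {form, test: makeTest(form, match, namespaces)};
  }
  return {form: 'xpath', test: null};
}

function makeTest(form, match, namespaces) {
  switch (form) {
    case 'tag': {
      const [, prefix, local] = match;
      // Prefixed names are compared via namespace URI, not via the prefix.
      return prefix ?
          (node) => node.localName === local &&
              node.namespaceURI === namespaces[prefix] :
          (node) => node.tagName === local;
    }
    case 'attr':
      return (node) => node.hasAttribute(match[1]);
    case 'attrEQ':
      return (node) => node.hasAttribute(match[1]) &&
          node.getAttribute(match[1]) === match[2];
    case 'attrNEQ':
      return (node) => !node.hasAttribute(match[1]) ||
          node.getAttribute(match[1]) !== match[2];
  }
}

// Usage with a minimal stand-in for a DOM element.
const element = {
  tagName: 'mi', localName: 'mi', namespaceURI: null,
  attributes: {font: 'bold'},
  hasAttribute(name) { return name in this.attributes; },
  getAttribute(name) {
    return this.hasAttribute(name) ? this.attributes[name] : null;
  },
};
console.log(classifyQuery('self::mi').test(element));       // true
console.log(classifyQuery('@font!="bold"').test(element));  // false
console.log(classifyQuery('self::*').form);                 // 'xpath'
```

Whether these direct tests actually beat XPath evaluation would still need to be measured, as the last item above says.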


**** Other speedup potential

***** count(children/*)=n

***** Usage of Xpath in postconditions

*** Inspect and mark constraints when sorting into Trie. Maybe annotate Trie node?


*** Ordering

**** Currently only by number.

**** Is there a better way?
Priorities?
Explicit ordering by name? Can be problematic as there can be multiple
rules with the same name.
If done by name, we could have an explicit order definition statement in a
rule store that would need to be collected and applied by a comparator.

**** Again ordering is independent of the Trie

** Changes to constraints of simple store elements

*** Special constraints: For single string elements

*** They only work on text nodes.

*** Rewriting application tests:

**** recognise the query self::text() and combine with boolean query.

**** Immediately do this when sorting into Trie

**** Stop building a "rule" query and instead have a trie subtype that specialises in simple stores.

*** Again test speed trade-off!

** Dynamic Constraints vs Property Test Sets

*** What do we mean by this: Dynamic constraints are "fixed" to a rule.
They are the ones that are tested against the properties chosen by the user.

*** Property sets are values for dynamic constraints that can be selected by the user.
Dynamic constraints are then tested against that set.

*** Fallbacks are dynamic constraints that a rule set can use to control, if defaults

**** This could give us tighter control over how we deal with fallbacks.

**** We could completely get rid of the "default" fallback and always force explicit fallback definition.

**** How does that work together with the strict setting?

***** When do we use strict?

***** If it is only used on particular sets, then they should not define fallbacks.

***** Simply do not use fallbacks when in strict mode.
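
One possible reading of the last option, sketched under assumptions: the user's property set is an object with one value per axis, a rule set declares explicit fallbacks per axis, and strict mode simply ignores them. The function name =allowedValues= and the data shapes are hypothetical.

```javascript
// Hypothetical sketch; function name and data shapes are assumptions.
// Builds, per axis, the ordered list of values a dynamic constraint may
// take: the user's chosen value first, then any explicit fallbacks.
function allowedValues(properties, fallbacks, strict) {
  const allowed = {};
  for (const axis of Object.keys(properties)) {
    const chosen = properties[axis];
    // In strict mode fallbacks are not used at all.
    const extra = strict ? [] : (fallbacks[axis] || []);
    allowed[axis] = [chosen, ...extra.filter((value) => value !== chosen)];
  }
  return allowed;
}

const properties = {domain: 'mathspeak', style: 'brief'};
const fallbacks = {style: ['default']};  // declared explicitly by the rule set
console.log(allowedValues(properties, fallbacks, false));
// prints { domain: [ 'mathspeak' ], style: [ 'brief', 'default' ] }
console.log(allowedValues(properties, fallbacks, true));
// prints { domain: [ 'mathspeak' ], style: [ 'brief' ] }
```

With no implicit "default" entry, an axis without a declared fallback only ever matches the user's chosen value.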

** TODO Make rule definition robust against errors!

* Trie design ideas:

** Usage

*** Have one trie per rule store or per domain?

**** One trie per rule store. They get the query and constraint functions as static functions.

**** They get string matching with respect to axes for the dynamic constraints.

**** add/delete/findRule are being run on the Trie.

*** When combining rule sets, combine tries or rather search through list of tries?
The former is probably too expensive and also might not make sense when swapping rule sets regularly.
Is that a use case? How often would we swap rule sets?

*** What about the simple rule stores? Should we simply go back to using the dictionary lookup?

** Design

*** Trie starting with dynamic constraint? Yes

*** Trie starting with query? No

*** Combined rule stores are to be replaced by a combined trie

** DataStructure

*** Node with

**** Type (root, dynamic, query, boolean, rule = leaf?)


***** We should probably have a class per node type.

**** Leaf or Rule nodes are not necessary. Instead we have an abstract class of constraint nodes.
Each of them can have a single rule attached.

**** SubType (string)

***** Dynamic: The axis name

***** Static: the form or xpath

***** Rule: name

**** The actual content (string)

***** Dynamic: value of axis

***** Static: xpath expression

***** Rule: postcondition or full rule object

**** Auxiliary content (static only) (string)

***** The comparison string or empty if none is necessary.

*** Should every node bring its own test? Or select test according to type?
Probably better the former, but has to be a static function!
Should be assigned during sub-type computation.
For dynamic computation that will be a bit problematic!
Dynamic match needs to use the global comparator.

*** Children implemented as Object.<string, node> where the string is the actual content.
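
A minimal sketch of the node structure described above, assuming a single node class with a =kind= field rather than one class per type, and leaving the per-node test functions out. The class name =TrieNode=, the helper =addRule= and the rule object layout are hypothetical.

```javascript
// Hypothetical sketch; class, helper and field names are assumptions.
class TrieNode {
  constructor(kind, constraint) {
    this.kind = kind;              // 'root', 'dynamic', 'query' or 'boolean'
    this.constraint = constraint;  // actual content; also the key in the parent
    this.children = Object.create(null);  // Object.<string, TrieNode>
    this.rule = null;              // a constraint node carries at most one rule
  }

  // Returns the child stored under the given content, creating it if needed.
  getOrAddChild(kind, constraint) {
    if (!this.children[constraint]) {
      this.children[constraint] = new TrieNode(kind, constraint);
    }
    return this.children[constraint];
  }
}

// Sorts a rule into the trie: dynamic constraint values first, then the
// query, then the boolean constraints. A later rule with an identical path
// replaces the earlier one in this sketch.
function addRule(root, rule) {
  let node = root;
  for (const value of rule.dynamic) {
    node = node.getOrAddChild('dynamic', value);
  }
  node = node.getOrAddChild('query', rule.query);
  for (const constraint of rule.booleans) {
    node = node.getOrAddChild('boolean', constraint);
  }
  node.rule = rule;
  return node;
}

// Usage: two rules sharing the dynamic prefix and the query.
const root = new TrieNode('root', '');
addRule(root, {name: 'identifier', dynamic: ['mathspeak', 'default'],
               query: 'self::mi', booleans: []});
addRule(root, {name: 'identifier-font', dynamic: ['mathspeak', 'default'],
               query: 'self::mi', booleans: ['@font']});
const queryNode = root.children['mathspeak'].children['default']
    .children['self::mi'];
console.log(queryNode.rule.name);                     // 'identifier'
console.log(queryNode.children['@font'].rule.name);   // 'identifier-font'
```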

*** Lookup of rules

**** Two types: Dynamic Constraint, Static Constraint

**** For dynamic constraints: Use order, test each constraint against a list of constraints.
E.g., [short, default].

***** Child node is accepted, if it is a dynamic node and its constraint is a member of the given constraint list

***** or if it is a static node.
This means we have a node that has a shorter dynamic constraint spec.
These can be used as defaults.

**** For static constraints:

***** Child node is accepted, if the test returns true.

**** Collecting rules along the valid paths in the trie:

***** If a matching (static) node contains a rule, it is collected.
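
The lookup described above might look roughly like this when written iteratively. It is a sketch under assumptions: nodes are plain objects, every static node carries its own =test= function, and the dynamic constraint is given as one list of accepted values per axis. The names =addChild= and =lookupRules= are hypothetical.

```javascript
// Hypothetical sketch; helper names and node layout are assumptions.
function addChild(parent, kind, constraint, test = null, rule = null) {
  const child = {kind, constraint, test, rule, children: {}};
  parent.children[constraint] = child;
  return child;
}

// 'dynamic' holds one list of accepted values per axis,
// e.g. [['mathspeak'], ['brief', 'default']].
function lookupRules(root, xmlNode, dynamic) {
  let frontier = [root];
  const pending = [];  // static nodes that still have to be tested
  for (const allowed of dynamic) {
    const next = [];
    for (const parent of frontier) {
      for (const child of Object.values(parent.children)) {
        if (child.kind !== 'dynamic') {
          pending.push(child);  // shorter dynamic spec, usable as default
        } else if (allowed.includes(child.constraint)) {
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  for (const parent of frontier) {
    pending.push(...Object.values(parent.children)
        .filter((child) => child.kind !== 'dynamic'));
  }
  // Static phase: follow every path whose tests succeed, collecting rules.
  const rules = [];
  while (pending.length) {
    const current = pending.pop();
    if (!current.test(xmlNode)) continue;
    if (current.rule) rules.push(current.rule);
    pending.push(...Object.values(current.children));
  }
  return rules;
}

// Usage: three styles for the same query, one of them with a boolean test.
const isMi = (node) => node.tagName === 'mi';
const hasFont = (node) => node.hasAttribute('font');
const trie = {kind: 'root', constraint: '', test: null, rule: null, children: {}};
const domain = addChild(trie, 'dynamic', 'mathspeak');
for (const style of ['default', 'brief', 'sbrief']) {
  const styleNode = addChild(domain, 'dynamic', style);
  const query = addChild(styleNode, 'query', 'self::mi', isMi, 'mi-' + style);
  if (style === 'default') addChild(query, 'boolean', '@font', hasFont, 'mi-font');
}
const xml = {tagName: 'mi', hasAttribute: (name) => name === 'font'};
const found = lookupRules(trie, xml, [['mathspeak'], ['brief', 'default']]);
console.log(found.sort());  // [ 'mi-brief', 'mi-default', 'mi-font' ]
```

The collected rules are unordered here; choosing among them is left to the comparator, independently of the trie.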

*** Depth and balancing might be interesting.
We could effectively invert order of dynamic and static constraints. Not sure if that makes any sense.

No, it does not. Better have some clever way of checking on the query
level. That is the bottleneck. E.g. try to only have tagname checks there (and *)


* Symbol mappings

** Could be done with a trie. But at the moment it turns out to be more efficient to leave as is.
See the abandoned tweak_simple_stores branch for a failed attempt.

** Maybe change mappings to contain entire dynamic constraints as they will get unwieldy with more axes.

** Give them a standard order: i.e., keep default, style: short>default

* Grammar structure

** General idea

*** Grammar elements are added and removed via personality annotations.

*** Keywords are mapped to either a string or a boolean.

*** XML element gets a special grammar attribute with a space-separated list of keywords and strings, or boolean.

*** This way we can check in the rules if a grammatical case is applicable.

**** This only works for the next level. Needs to be repeated, if necessary.

**** Alternatively, always propagate the grammar attribute.

*** Should work both for [n] and [m] nodes.

** To subsume preprocess, correction, remove, sre_flag, font/hiddenfont, annotation:unit
Got rid of font, sre_flag, remove.

** Singleton structure similar to global parameters

*** Holds mappings of grammar keywords to either strings or booleans.

*** Equipped with mappings to correction functions for certain grammar keywords.

** Dispatch via an extra grammar keyword in personality annotations.

*** Test with determinant simple and hidden fonts.

*** New grammar syntax in personality annotations:

**** grammar:aa:bb="something":cc=@font:dd=CSFsomething:!ee
Note the overall separator is : so as not to conflict with separators between
personality annotations.

***** Adds boolean aa

***** Adds bb with value something

***** Adds cc with font name of the current node

***** Adds dd with string computed by CSFsomething

***** Removes ee
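
The syntax above could be parsed along these lines. This is only a sketch: the function names =parseGrammar= and =applyGrammar=, the shape of the resulting operations, and the assumption that quoted values never contain a colon are all hypothetical.

```javascript
// Hypothetical sketch; names and result shapes are assumptions.
// Assumes quoted values contain no ':' characters.
function parseGrammar(annotation) {
  const prefix = 'grammar:';
  if (!annotation.startsWith(prefix)) {
    throw new Error('Not a grammar annotation: ' + annotation);
  }
  return annotation.slice(prefix.length).split(':').map((part) => {
    if (part.startsWith('!')) return {op: 'remove', key: part.slice(1)};
    const index = part.indexOf('=');
    if (index === -1) {
      return {op: 'add', key: part, kind: 'boolean', value: true};
    }
    const key = part.slice(0, index);
    const raw = part.slice(index + 1);
    if (/^".*"$/.test(raw)) {
      return {op: 'add', key, kind: 'string', value: raw.slice(1, -1)};
    }
    if (raw.startsWith('@')) {
      return {op: 'add', key, kind: 'attribute', value: raw.slice(1)};
    }
    if (raw.startsWith('CSF')) {
      return {op: 'add', key, kind: 'function', value: raw};
    }
    throw new Error('Unrecognised grammar value: ' + raw);
  });
}

// Applies parsed operations to a copy of the current grammar state.
function applyGrammar(state, operations, node, functions) {
  const result = Object.assign({}, state);
  for (const operation of operations) {
    if (operation.op === 'remove') {
      delete result[operation.key];
    } else if (operation.kind === 'attribute') {
      result[operation.key] = node.getAttribute(operation.value);
    } else if (operation.kind === 'function') {
      result[operation.key] = functions[operation.value](node);
    } else {
      result[operation.key] = operation.value;
    }
  }
  return result;
}

// Usage with stand-ins for the current node and a custom string function.
const operations =
    parseGrammar('grammar:aa:bb="something":cc=@font:dd=CSFsomething:!ee');
const current = {getAttribute: (name) => (name === 'font' ? 'bold' : null)};
const functions = {CSFsomething: () => 'computed'};
console.log(applyGrammar({ee: true}, operations, current, functions));
// prints { aa: true, bb: 'something', cc: 'bold', dd: 'computed' }
```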

*** We might want to have a special function for grammar checking instead of the grammar attribute
Check after integration with the Trie (WP1.3)