C User Dictionary

Jos Denys edited this page Jul 16, 2021 · 4 revisions

The User Dictionary has its own class definition; see the udct example for a working demonstration.

The current implementation supports two things: forcing or suppressing a sentence-end condition, and semantically tagging lexical representations that are not part of the language model.

class IKNOW_API UserDictionary
{
public:
	// Clear the User Dictionary object
	void clear();

	// Attach a User Dictionary label to a lexical representation, for customization purposes.
	// Currently available labels are: "UDNegation", "UDPosSentiment", "UDNegSentiment", "UDConcept", "UDRelation", "UDNonRelevant", "UDUnit", "UDNumber" and "UDTime".
	// Returns iKnowEngine::iknow_unknown_label if an invalid label is passed as parameter.
	int addLabel(const std::string& literal, const char* UdctLabel);

	// Add a User Dictionary literal rewrite. *Not* functional: present only for compatibility with the IRIS implementation.
	// The intent is to rewrite, for example, "dr." to "doctor", so that similar lexical representations are aggregated.
	void addEntry(const std::string& literal, const std::string& literal_rewrite);

	// Add User Dictionary EndNoEnd, enables force/suppress sentence end conditions.
	// See iKnowUnitTests::test5() for an example.
	void addSEndCondition(const std::string& literal, bool b_end = true);

	// Shortcut for known UD labels
	void addConceptTerm(const std::string& literal); // tag literal as a concept
	void addRelationTerm(const std::string& literal); // tag literal as a relation
	void addNonrelevantTerm(const std::string& literal); // tag literal as non-relevant

	void addUnitTerm(const std::string& literal); // tag literal as a unit
	void addNumberTerm(const std::string& literal); // tag literal as a number
	void addTimeTerm(const std::string& literal); // tag literal as a time indicator
	void addNegationTerm(const std::string& literal); // tag literal as a negation
	void addPositiveSentimentTerm(const std::string& literal); // tag literal as a positive sentiment
	void addNegativeSentimentTerm(const std::string& literal); // tag literal as a negative sentiment

	void addGeneric1(const std::string& literal); // tag literal as generic1
	void addGeneric2(const std::string& literal); // tag literal as generic2
	void addGeneric3(const std::string& literal); // tag literal as generic3

	int addCertaintyLevel(const std::string& literal, int level = 0); // add a certainty level
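
As a rough illustration of the tagging calls above, the sketch below fills a dictionary and checks the return value of addLabel(). So that it compiles without the iKnow library, it uses a small stand-in class in a hypothetical namespace `udct_sketch` that mirrors only the documented signatures. The internal storage, the `size()` helper, the forwarding of addConceptTerm() to addLabel(), the sample literals, and the success value 0 are all assumptions; only the error value -2 (iknow_unknown_label) and the label names come from this page.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

namespace udct_sketch {

// Error value as listed in the iKnowEngine::errcodes excerpt on this page.
constexpr int iknow_unknown_label = -2;

// Stand-in for the real UserDictionary: same method signatures, assumed internals.
class UserDictionary {
public:
    void clear() {
        labels_.clear();
        sentence_ends_.clear();
    }

    // Rejects labels that are not in the documented list.
    int addLabel(const std::string& literal, const char* UdctLabel) {
        static const std::set<std::string> known = {
            "UDNegation", "UDPosSentiment", "UDNegSentiment", "UDConcept",
            "UDRelation", "UDNonRelevant", "UDUnit", "UDNumber", "UDTime"};
        if (UdctLabel == nullptr || known.count(UdctLabel) == 0)
            return iknow_unknown_label;
        labels_[literal] = UdctLabel;
        return 0; // assumed success value, not documented on this page
    }

    // Assumption: the shortcut simply forwards to addLabel().
    void addConceptTerm(const std::string& literal) { addLabel(literal, "UDConcept"); }

    // b_end = true forces a sentence end, false suppresses it.
    void addSEndCondition(const std::string& literal, bool b_end = true) {
        sentence_ends_[literal] = b_end;
    }

    // Hypothetical helper, used only to inspect the stand-in.
    std::size_t size() const { return labels_.size() + sentence_ends_.size(); }

private:
    std::map<std::string, std::string> labels_;
    std::map<std::string, bool> sentence_ends_;
};

// Populates the dictionary and returns how many labels were rejected.
inline int buildDictionary(UserDictionary& udct) {
    int rejected = 0;
    if (udct.addLabel("mg", "UDUnit") == iknow_unknown_label) ++rejected;
    if (udct.addLabel("not ever", "UDNegation") == iknow_unknown_label) ++rejected;
    if (udct.addLabel("mg", "UDBogus") == iknow_unknown_label) ++rejected; // invalid label
    udct.addConceptTerm("heart attack");
    udct.addSEndCondition("Fr.", false); // do not end the sentence after "Fr."
    return rejected;
}

} // namespace udct_sketch
```

With the real library, the stand-in class would be dropped in favor of the iKnow engine header, and the calls in `buildDictionary()` should carry over unchanged as far as the declarations above go.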

The iKnow engine has two methods for handling a user dictionary:

class IKNOW_API iKnowEngine
{
public:
	enum errcodes {
		iknow_language_not_supported = -1, // unsupported language
		iknow_unknown_label = -2	// udct addLabel : label does not exist
	};

	// User dictionary methods:
	//     loadUserDictionary : loads *and* activates the user dictionary object. If another dictionary is already active, it is unloaded and deactivated first. Throws an exception if the udct object cannot be loaded.
	//     unloadUserDictionary : unloads and deactivates the active user dictionary.
	void loadUserDictionary(UserDictionary& udct);
	void unloadUserDictionary(void);

The iKnow indexer engine handles only one user dictionary at a time. To load and activate one, call loadUserDictionary(); to deactivate it, call unloadUserDictionary(). If a dictionary is already loaded, loadUserDictionary() unloads it before loading the new one.
You can reuse an existing user dictionary object by calling its clear() method.
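
The load, replace, and unload cycle described above might look roughly like the following. This is a sketch built on stand-in classes in a hypothetical namespace `engine_sketch`, not on the library itself. The stand-in engine copies the dictionary entries on load, and the `hasUserDictionary()` and `activeEntries()` helpers exist only for inspection. Both are assumptions, as this page does not say how the real engine stores a loaded dictionary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace engine_sketch {

// Stand-in dictionary: only the two calls needed for this example.
struct UserDictionary {
    std::vector<std::string> terms;
    void clear() { terms.clear(); }
    void addConceptTerm(const std::string& literal) { terms.push_back(literal); }
};

// Stand-in engine: holds at most one active dictionary.
class iKnowEngine {
public:
    void loadUserDictionary(UserDictionary& udct) {
        unloadUserDictionary();   // a previously active dictionary is dropped first
        active_ = udct.terms;     // assumption: entries are copied on load
        loaded_ = true;
    }
    void unloadUserDictionary() {
        active_.clear();
        loaded_ = false;
    }
    bool hasUserDictionary() const { return loaded_; }          // hypothetical helper
    std::size_t activeEntries() const { return active_.size(); } // hypothetical helper

private:
    std::vector<std::string> active_;
    bool loaded_ = false;
};

// Returns the number of active entries after the second load.
inline std::size_t lifecycleDemo() {
    iKnowEngine engine;
    UserDictionary udct;

    udct.addConceptTerm("heart attack");
    udct.addConceptTerm("blood pressure");
    engine.loadUserDictionary(udct);      // first dictionary active: 2 entries

    udct.clear();                         // reuse the same object
    udct.addConceptTerm("stock option");
    engine.loadUserDictionary(udct);      // replaces the first one: 1 entry

    const std::size_t n = engine.activeEntries();
    engine.unloadUserDictionary();        // no dictionary active afterwards
    assert(!engine.hasUserDictionary());
    return n;
}

} // namespace engine_sketch
```

Because the real loadUserDictionary() is documented to throw when the dictionary cannot be loaded, production code would likely wrap the call in a try block; the exception type is not given on this page, so it is left out here.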