C Interface

Jos Denys edited this page Oct 4, 2021 · 2 revisions

As of v1.3, the iKnow engine supports a C interface that covers all engine functionality in a JSON-encoded request/response style.

Its C definition is as follows:

//
// C function declaration
//
typedef int (*p_iknow_json)(const char* request, const char** response);

iKnowExplicitTest.cpp has been added as a module to give an overview of all possibilities:

    p_iknow_json iknow_json = get_iknow_json();

    const char* j_response;
    {
        string j_request(R"({"method" : "GetLanguagesSet"})");
        cout << endl << "request:" << j_request << endl;
        iknow_json(j_request.c_str(), &j_response);
        cout << "response:" << j_response << std::endl;
    }
    {
        string j_request(R"({"method" : "NormalizeText", "language":"fr", "text_source" : "Risque d'exploitation"})");
        cout << endl << "request:" << j_request << endl;
        iknow_json(j_request.c_str(), &j_response);
        cout << "response:" << j_response << std::endl;
    }
    {
        const char text_source[] = u8"Микротерминатор может развивать скорость до 30 сантиметров за секунду, пишут калининградские СМИ.";
        string j_request = R"({"method" : "IdentifyLanguage", "text_source" : ")" + string(text_source) + "\"}";
        cout << endl << "request:" << j_request << endl;
        iknow_json(j_request.c_str(), &j_response);
        cout << "response:" << j_response << std::endl;
    }
    {
        string j_request(R"({"method" : "index", "language" : "en", "text_source" : "This is a test of the Python interface to the iKnow engine. Be the change you want to see in life. Now, I have been on many walking holidays, but never on one where I have my bags ferried\nfrom hotel to hotel while I simply get on with the job of walkingand enjoying myself.", "b_trace" : true})");
        cout << endl << "request:" << j_request << endl;
        iknow_json(j_request.c_str(), &j_response);
        cout << "response:" << j_response << std::endl;
    }
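The samples above splice `text_source` straight into the JSON string, which only works as long as the text contains no double quotes, backslashes or control characters. The following is a minimal sketch of an escaping step, not part of the iKnow sources: `json_escape` and `make_identify_request` are hypothetical helper names, and the input is assumed to be UTF-8 that is passed through unchanged.

```cpp
#include <cstdio>
#include <string>

// Escape a UTF-8 string so it can be placed between double quotes in JSON.
// Hypothetical helper, not part of the iKnow engine.
std::string json_escape(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (unsigned char c : text) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters become \u00XX.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    // Printable ASCII and UTF-8 bytes are copied as-is.
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Build an "IdentifyLanguage" request in the same layout as the sample above.
std::string make_identify_request(const std::string& text) {
    return R"({"method" : "IdentifyLanguage", "text_source" : ")" + json_escape(text) + "\"}";
}
```

With such a helper, a text like `He said "hi"` can be sent without producing invalid JSON. The same escaping applies to the `text_source` of the other methods.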

The engine allocates the memory needed to store the response and also cleans up that buffer, so the client must not free it. The client should parse the response to extract the data it needs.

The "method" key in the JSON request selects the method to call. Currently there are four:
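Because the engine owns the response buffer, a client that needs the data beyond the current call can copy it first. The sketch below wraps that step. It makes two assumptions that this page does not confirm: the buffer may be reused by a later call, and the meaning of the `int` return value is unknown, so it is simply passed back. `IknowReply`, `call_iknow` and `fake_engine` are hypothetical names; `fake_engine` only stands in for the real engine so that the example is self-contained.

```cpp
#include <string>

// Function pointer type from the C declaration above.
typedef int (*p_iknow_json)(const char* request, const char** response);

// Result of one call: the engine's return code plus a private copy of the response.
struct IknowReply {
    int code;
    std::string json;
};

// Call the engine and copy the response immediately. The buffer behind
// `response` belongs to the engine, so it is never freed here.
IknowReply call_iknow(p_iknow_json fn, const std::string& request) {
    const char* response = nullptr;
    int code = fn(request.c_str(), &response);
    return IknowReply{code, response ? std::string(response) : std::string()};
}

// Stand-in engine for illustration only (assumption, not the real engine):
// it keeps its reply in a static buffer that is overwritten on each call.
int fake_engine(const char* request, const char** response) {
    static std::string buffer;
    buffer = std::string("{\"echo\":") + request + "}";
    *response = buffer.c_str();
    return 0;
}
```

Calling `call_iknow(fake_engine, "1")` and then `call_iknow(fake_engine, "2")` leaves two independent strings, even though the stand-in reuses its buffer between calls.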

  • GetLanguagesSet : retrieve the supported language set.
  • NormalizeText : normalize a piece of text.
  • IdentifyLanguage : identify the language of a piece of text.
  • index : the main function, indexing of a text source.

The remaining key/value pairs in the request are the method parameters. "GetLanguagesSet" has none:

request_json:
{"method":"GetLanguagesSet"}

response_json:
{
    "iknow_languages": [
        "cs",
        "de",
        "en",
        "es",
        "fr",
        "ja",
        "nl",
        "pt",
        "ru",
        "sv",
        "uk"
    ]
}

"NormalizeText" has two parameters: "language" specifies the language, and "text_source" is the text to be normalized:

request_json:
{"language":"fr","method":"NormalizeText","text_source":"Risque d'exploitation"}

response_json:
{
    "normalized": "risque d' exploitation"
}

"IdentifyLanguage" has one parameter, "text_source", the text whose language is to be identified:

request_json:
{"method":"IdentifyLanguage","text_source":"Микротерминатор может развивать скорость до 30 сантиметров за секунду, пишут калининградские СМИ."}

response_json:
{
    "certainty": "0.832918",
    "language": "ru"
}
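In the response above, "certainty" arrives as a JSON string rather than a number, so a client has to convert it before comparing it with a threshold. The following is a minimal sketch using only the standard library. `flat_string_value` and `parse_identify_response` are hypothetical helpers, and the lookup is deliberately naive: it assumes a flat object with string values and no escaped quotes, as in the sample response. A real JSON library is the safer choice for anything beyond that.

```cpp
#include <string>

// Naive lookup of a string-valued key in a flat JSON object.
// Returns an empty string when the key is absent. Not a general JSON
// parser: escapes and nested objects are not handled.
std::string flat_string_value(const std::string& json, const std::string& key) {
    const std::string quoted_key = "\"" + key + "\"";
    std::size_t pos = json.find(quoted_key);
    if (pos == std::string::npos) return "";
    pos = json.find(':', pos + quoted_key.size());
    if (pos == std::string::npos) return "";
    const std::size_t open = json.find('"', pos);
    if (open == std::string::npos) return "";
    const std::size_t close = json.find('"', open + 1);
    if (close == std::string::npos) return "";
    return json.substr(open + 1, close - open - 1);
}

struct LanguageGuess {
    std::string language;
    double certainty;
};

// Extract both fields of an "IdentifyLanguage" response.
LanguageGuess parse_identify_response(const std::string& json) {
    LanguageGuess guess;
    guess.language = flat_string_value(json, "language");
    const std::string certainty = flat_string_value(json, "certainty");
    // std::stod converts the decimal string; a missing value maps to 0.0.
    guess.certainty = certainty.empty() ? 0.0 : std::stod(certainty);
    return guess;
}
```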

The main "index" method has two required parameters, "language" and "text_source", and one optional parameter, "b_trace". If "b_trace" is true, all linguistic traces are collected and added to the JSON response. If it is not specified, it defaults to false.

request_json:
{"b_trace":true,"language":"en","method":"index","text_source":"This is a test of the Python interface to the iKnow engine. Be the change you want to see in life. Now, I have been on many walking holidays, but never on one where I have my bags ferried\nfrom hotel to hotel while I simply get on with the job of walkingand enjoying myself."}

response_json:
{
    "proximity": [
        [
            [
                22,
                24
            ],
            106
        ],
        ...
    "sentences": {
        "1": {
            "attributes": [],
            "entities": [
                {
                    "dominance_value": 0.0,
                    "entity_id": 1,
                    "index": "this",
                    "offset_start": 0,
                    "offset_stop": 4,
                    "type": "PathRelevant"
                },
                {
                    "dominance_value": 71.0,
                    "entity_id": 2,
                    "index": "is",
                    "offset_start": 5,
                    "offset_stop": 7,
                    "type": "Relation"
                },
                {
                    "dominance_value": 0.0,
                    "entity_id": 0,
                    "index": "a",
                    "offset_start": 8,
                    "offset_stop": 9,
                    "type": "NonRelevant"
                },
                {
                    "dominance_value": 333.0,
                    "entity_id": 3,
                    "index": "test",
                    "offset_start": 10,
                    "offset_stop": 14,
                    "type": "Concept"
                },
                ...
            "path": [
                0,
                1,
                3,
                4,
                6,
                7,
                9
            ],
            "path_attributes": []
        },
        "2": {
            "attributes": [],
            "entities": [
                {
                    "dominance_value": 71.0,
                    "entity_id": 8,
                    "index": "be",
                    "offset_start": 60,
                    "offset_stop": 62,
                    "type": "Relation"
                },
         ...
           "path_attributes": [
                {
                    "position": 4,
                    "span": 11,
                    "type": "negation"
                },
                {
                    "position": 3,
                    "span": 1,
                    "type": "date_time"
                },
                {
                    "position": 15,
                    "span": 7,
                    "type": "positive_sentiment"
                }
              ...
    "traces": [
        "NormalizeToken:\"This\"=\"this\";",
        "LexrepCreated:<lexrep id=3 type=Unknown value=\"This\" index=\"this\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=4 type=Unknown value=\"is\" index=\"is\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=5 type=Unknown value=\"a\" index=\"a\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=6 type=Unknown value=\"test\" index=\"test\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=7 type=Unknown value=\"of\" index=\"of\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=8 type=Unknown value=\"the\" index=\"the\" labels=\"ENCon;\" />;",
        "NormalizeToken:\"Python\"=\"python\";",
        "LexrepCreated:<lexrep id=9 type=Unknown value=\"Python\" index=\"python\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=10 type=Unknown value=\"interface\" index=\"interface\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=11 type=Unknown value=\"to\" index=\"to\" labels=\"ENCon;\" />;",
        "LexrepCreated:<lexrep id=12 type=Unknown value=\"the\" index=\"the\" labels=\"ENCon;\" />;",
        "NormalizeToken:\"iKnow\"=\"iknow\";",
        ...
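Two details of this excerpt lend themselves to a short sketch. First, each entity's "offset_start" and "offset_stop" appear to delimit a half-open range in "text_source": 0 to 4 covers "This", 10 to 14 covers "test", while "index" holds the normalized, lowercased form. Second, each trace entry looks like a name, a colon, a payload and a closing semicolon. Both readings are inferred from the excerpt alone, and the helpers `entity_text` and `split_trace` are hypothetical. Offsets are assumed to count bytes, which holds for this ASCII sample; whether they count bytes or characters for non-ASCII text is not stated on this page.

```cpp
#include <string>
#include <utility>

// Return the slice [offset_start, offset_stop) of the source text,
// or an empty string if the range does not fit.
std::string entity_text(const std::string& source,
                        std::size_t offset_start,
                        std::size_t offset_stop) {
    if (offset_start > offset_stop || offset_stop > source.size()) return "";
    return source.substr(offset_start, offset_stop - offset_start);
}

// Split one trace entry into (name, payload) at the first ':'
// and drop a trailing ';' from the payload if present.
std::pair<std::string, std::string> split_trace(const std::string& trace) {
    const std::size_t colon = trace.find(':');
    if (colon == std::string::npos) return std::make_pair(trace, std::string());
    std::string payload = trace.substr(colon + 1);
    if (!payload.empty() && payload.back() == ';') payload.pop_back();
    return std::make_pair(trace.substr(0, colon), payload);
}
```

Applied to the sample text, `entity_text(text, 0, 4)` yields the original surface form "This", which differs in case from the entity's "index" value "this".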

All parameters correspond to those of the C++ API; see its documentation for more detailed information. The JSON C interface is a serialized form of working with the iKnow engine.