Strange error on insertion with captures #1398

Closed
GavinMendelGleason opened this issue Aug 22, 2022 · 12 comments · Fixed by #1457

@GavinMendelGleason
Member

Using this schema: https://github.com/terminusdb-labs/alice_concordance/blob/main/schema/concordance.json

We get an error with the following insert:

echo '{
    "@type": "Paragraph",
    "@capture": ".paragraph 1 0",
    "text": "Down the Rabbit-Hole",
    "terms": [
      {
        "@type": "TermDF",
        "term": {
          "@ref": ".term Down"
        },
        "df": 1
      }
    ]
  },
  {
    "@type": "Term",
    "@capture": ".term Down",
    "tf": 1,
    "term": "Down"
  }' |  ./terminusdb doc insert admin/alice
@GavinMendelGleason GavinMendelGleason added the bug Something isn't working label Aug 22, 2022
@GavinMendelGleason
Member Author

This appears to be an interaction between key strategy and using a ref.

@mplucinski

It seems the same problem occurs for me with the following minimal example:

schema:

[
  {
    "@type": "@context",
    "@schema": "http://terminusdb.com/schema/woql#",
    "@base": "terminusdb://woql/data/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
  },
  {
    "@type": "Class",
    "@id": "Type1",
    "@key": {
      "@type": "Random"
    },
    "other": "Type1"
  }
]

data:

[
    {
        "@type": "Type1",
        "@capture": "obj-foo",
        "other": {"@ref": "obj-bar"}
    },
    {
        "@type": "Type1",
        "@capture": "obj-bar",
        "other": {"@ref": "obj-foo"}
    }
]

error on insert:

{
  "api:message":"Error: instantiation_error\n  [50] throw(error(instantiation_error,_34888))\n  [48] triplestore:insert(read_write_obj{descriptor:branch_graph{branch_name:\"main\",commit_type:'http://terminusdb.com/schema/ref#ValidCommit',database_name:\"test_db\",organization_name:\"admin\",repository_name:\"local\",type:instance},read:_34938,triple_update:false,write:<builder 67dde0ba51675f0f7e76ae79cd4c546c6ca68dd6>},'terminusdb://woql/data/Type1/27b002911d907290cc4627ebbec059c637616e497ac6ced0c2157eff4d01fafb','http://terminusdb.com/schema/woql#other',_34930,_34932) at /app/terminusdb/src/core/triple/triplestore.pl:261\n  [47] 'document/json':insert_document_expanded(transaction_object{commit_info:commit_info{},descriptor:branch_descriptor{branch_name:\"main\",repository_descriptor: ...},inference_objects:[],instance_objects:[...],parent:transaction_object{descriptor: ...,inference_objects:[],instance_objects: ...,parent: ...,schema_objects: ...},schema_objects:[...]},json{'@capture':\"obj-foo\",'@id':'terminusdb://woql/data/Type1/27b002911d907290cc4627ebbec059c637616e497ac6ced0c2157eff4d01fafb','@type':'http://terminusdb.com/schema/woql#Type1','http://terminusdb.com/schema/woql#other':json{'@id':_35130,'@ref':\"obj-bar\",'@type':\"@id\"}},'terminusdb://woql/data/Type1/27b002911d907290cc4627ebbec059c637616e497ac6ced0c2157eff4d01fafb') at /app/terminusdb/src/core/document/json.pl:2865\n  [45] utils:do_or_die('<garbage_collected>',error(document_insertion_failed_unexpectedly(...),_35180)) at /app/terminusdb/src/core/util/utils.pl:143\n  [42] utils:nb_thread_var('<garbage_collected>',state(t)) at /app/terminusdb/src/core/util/utils.pl:1175\n  [41] '<meta-call>'('<garbage_collected>') <foreign>\n  [40] findall_loop('terminusdb://woql/data/Type1/27b002911d907290cc4627ebbec059c637616e497ac6ced0c2157eff4d01fafb','<garbage_collected>',_35294,[]) at /usr/lib/swipl/boot/bags.pl:99\n  [39] 
setup_call_catcher_cleanup('$bags':'$new_findall_bag','$bags':findall_loop('terminusdb://woql/data/Type1/27b002911d907290cc4627ebbec059c637616e497ac6ced0c2157eff4d01fafb',...,_35352,[]),_35330,'$bags':'$destroy_findall_bag') at /usr/lib/swipl/boot/init.pl:663\n  [34] '<meta-call>'('<garbage_collected>') <foreign>\n  [33] call('<garbage_collected>') at /usr/lib/swipl/boot/init.pl:499\n  [32] catch(database:call(...),fail_transaction,database:(_35478=true)) at /usr/lib/swipl/boot/init.pl:562\n  [31] database:with_transaction_(query_context{all_witnesses:false,authorization:'terminusdb://system/data/User/admin',bindings:[],commit_info:commit_info{author:admin,message:data},default_collection:branch_descriptor{branch_name:\"main\",repository_descriptor: ...},files:[],filter:type_filter{types: ...},prefixes:_35584{'@base':\"terminusdb://woql/data/\",'@schema':\"http://terminusdb.com/schema/woql#\",'@type':'Context',api:'http://terminusdb.com/schema/api#',json:'http://terminusdb.com/schema/json#',owl:'http://www.w3.org/2002/07/owl#',rdf:'http://www.w3.org/1999/02/22-rdf-syntax-ns#',rdfs:'http://www.w3.org/2000/01/rdf-schema#',sys:'http://terminusdb.com/schema/sys#',vio:'http://terminusdb.com/schema/vio#',woql:'http://terminusdb.com/schema/woql#',xdd:'http://terminusdb.com/schema/xdd#',xsd:\"http://www.w3.org/2001/XMLSchema#\"},selected:[],system:system_descriptor{},transaction_objects:[...],update_guard:_35566,write_graph:branch_graph{branch_name:\"main\",database_name:\"test_db\",organization_name:\"admin\",repository_name:\"local\",type:instance}},api_document:(...,...),_35520) at /app/terminusdb/src/core/transaction/database.pl:234\n  [30] setup_call_catcher_cleanup(database:pre_transaction_tabling,database:with_transaction_(...,...,_35766),_35744,database:post_transaction_tabling) at /usr/lib/swipl/boot/init.pl:663\n  [27] 
api_document:api_insert_documents('<garbage_collected>','terminusdb://system/data/User/admin','admin/test_db',<stream>(0x55e920dd2800),no_data_version,_35818,'<garbage_collected>','<garbage_collected>') at /app/terminusdb/src/core/api/api_document.pl:249\n  [26] '<meta-call>'('<garbage_collected>') <foreign>\n  [25] catch(routes:(...,...),error(instantiation_error,context(_35912,_35914)),routes:do_or_die(...,...)) at /usr/lib/swipl/boot/init.pl:562\n  [24] catch_with_backtrace('<garbage_collected>','<garbage_collected>','<garbage_collected>') at /usr/lib/swipl/boot/init.pl:629\n\nNote: some frames are missing due to last-call optimization.\nRe-run your program in debug mode (:- debug.) to get more detail.\n\n",
  "api:status":"api:server_error"
}

extracted stacktrace:

Error: instantiation_error
  [50] throw(error(instantiation_error,_34888))
  [48] triplestore:insert(read_write_obj{descriptor:branch_graph{branch_name:"main",commit_type:'http://terminusdb.com/schema/ref#ValidCommit',database_name:"test_db",organization_name:"admin",repository_name:"local",type:instance},read:_34938,triple_update:false,write:<builder 3c621493365c9a5fc73121d1bbe636cb58cc7770>},'terminusdb://woql/data/Type1/8d20c5717245b12fe3ddb83b90d5c00e4122367ceafdb194bee0a9a013c5628d','http://terminusdb.com/schema/woql#other',_34930,_34932) at /app/terminusdb/src/core/triple/triplestore.pl:261
  [47] 'document/json':insert_document_expanded(transaction_object{commit_info:commit_info{},descriptor:branch_descriptor{branch_name:"main",repository_descriptor: ...},inference_objects:[],instance_objects:[...],parent:transaction_object{descriptor: ...,inference_objects:[],instance_objects: ...,parent: ...,schema_objects: ...},schema_objects:[...]},json{'@capture':"obj-foo",'@id':'terminusdb://woql/data/Type1/8d20c5717245b12fe3ddb83b90d5c00e4122367ceafdb194bee0a9a013c5628d','@type':'http://terminusdb.com/schema/woql#Type1','http://terminusdb.com/schema/woql#other':json{'@id':_35130,'@ref':"obj-bar",'@type':"@id"}},'terminusdb://woql/data/Type1/8d20c5717245b12fe3ddb83b90d5c00e4122367ceafdb194bee0a9a013c5628d') at /app/terminusdb/src/core/document/json.pl:2865
  [45] utils:do_or_die('<garbage_collected>',error(document_insertion_failed_unexpectedly(...),_35180)) at /app/terminusdb/src/core/util/utils.pl:143
  [42] utils:nb_thread_var('<garbage_collected>',state(t)) at /app/terminusdb/src/core/util/utils.pl:1175
  [41] '<meta-call>'('<garbage_collected>') <foreign>
  [40] findall_loop('terminusdb://woql/data/Type1/8d20c5717245b12fe3ddb83b90d5c00e4122367ceafdb194bee0a9a013c5628d','<garbage_collected>',_35294,[]) at /usr/lib/swipl/boot/bags.pl:99
  [39] setup_call_catcher_cleanup('$bags':'$new_findall_bag','$bags':findall_loop('terminusdb://woql/data/Type1/8d20c5717245b12fe3ddb83b90d5c00e4122367ceafdb194bee0a9a013c5628d',...,_35352,[]),_35330,'$bags':'$destroy_findall_bag') at /usr/lib/swipl/boot/init.pl:663
  [34] '<meta-call>'('<garbage_collected>') <foreign>
  [33] call('<garbage_collected>') at /usr/lib/swipl/boot/init.pl:499
  [32] catch(database:call(...),fail_transaction,database:(_35478=true)) at /usr/lib/swipl/boot/init.pl:562
  [31] database:with_transaction_(query_context{all_witnesses:false,authorization:'terminusdb://system/data/User/admin',bindings:[],commit_info:commit_info{author:admin,message:data},default_collection:branch_descriptor{branch_name:"main",repository_descriptor: ...},files:[],filter:type_filter{types: ...},prefixes:_35584{'@base':"terminusdb://woql/data/",'@schema':"http://terminusdb.com/schema/woql#",'@type':'Context',api:'http://terminusdb.com/schema/api#',json:'http://terminusdb.com/schema/json#',owl:'http://www.w3.org/2002/07/owl#',rdf:'http://www.w3.org/1999/02/22-rdf-syntax-ns#',rdfs:'http://www.w3.org/2000/01/rdf-schema#',sys:'http://terminusdb.com/schema/sys#',vio:'http://terminusdb.com/schema/vio#',woql:'http://terminusdb.com/schema/woql#',xdd:'http://terminusdb.com/schema/xdd#',xsd:"http://www.w3.org/2001/XMLSchema#"},selected:[],system:system_descriptor{},transaction_objects:[...],update_guard:_35566,write_graph:branch_graph{branch_name:"main",database_name:"test_db",organization_name:"admin",repository_name:"local",type:instance}},api_document:(...,...),_35520) at /app/terminusdb/src/core/transaction/database.pl:234
  [30] setup_call_catcher_cleanup(database:pre_transaction_tabling,database:with_transaction_(...,...,_35766),_35744,database:post_transaction_tabling) at /usr/lib/swipl/boot/init.pl:663
  [27] api_document:api_insert_documents('<garbage_collected>','terminusdb://system/data/User/admin','admin/test_db',<stream>(0x5580bbf3e300),no_data_version,_35818,'<garbage_collected>','<garbage_collected>') at /app/terminusdb/src/core/api/api_document.pl:249
  [26] '<meta-call>'('<garbage_collected>') <foreign>
  [25] catch(routes:(...,...),error(instantiation_error,context(_35912,_35914)),routes:do_or_die(...,...)) at /usr/lib/swipl/boot/init.pl:562
  [24] catch_with_backtrace('<garbage_collected>','<garbage_collected>','<garbage_collected>') at /usr/lib/swipl/boot/init.pl:629

Note: some frames are missing due to last-call optimization.
Re-run your program in debug mode (:- debug.) to get more detail.

@GavinMendelGleason
Member Author

The example above appears to work in the 'main' branch:

gavin@titan:~/dev/terminusdb$ echo '[
  {
    "@type": "@context",
    "@schema": "http://terminusdb.com/schema/woql#",
    "@base": "terminusdb://woql/data/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
  },
  {
    "@type": "Class",
    "@id": "Type1",
    "@key": {
      "@type": "Random"
    },
    "other": "Type1"
  }
]' | terminusdb doc insert -f -g schema admin/captures
> > > > > > > > > > > > > > > Documents inserted:
 1: Type1
gavin@titan:~/dev/terminusdb$ echo '[
    {
        "@type": "Type1",
        "@capture": "obj-foo",
        "other": {"@ref": "obj-bar"}
    },
    {
        "@type": "Type1",
        "@capture": "obj-bar",
        "other": {"@ref": "obj-foo"}
    }
]' | terminusdb doc insert admin/captures
> > > > > > > > > > > Documents inserted:
 1: terminusdb://woql/data/Type1/4800cd2290789bc6a164b89cce60eded097ab3b0bb13fba27a0d1c47deaea05a
 2: terminusdb://woql/data/Type1/40032be865910ce1622a7aa7cf386504213634a96ce9bd52d6adb1387834cb08

@mplucinski

Hm, I should have mentioned that I used the version provided by terminusdb-bootstrap. I'll retry with the latest main if I manage to build it.

@GavinMendelGleason
Member Author

The bootstrap version can be changed to 'dev' and it will track the main branch in development. This might be easier than building yourself if that proves difficult.

@mplucinski

I just tried 'dev' (my local build crashes on "store init"; I'll file a separate bug for that), but that didn't help. Here's the script I use to reproduce the problem. Maybe I'm doing something wrong there?

#!/usr/bin/env python3
import os
import pathlib
import requests
import subprocess
import time

def run(cmd):
	print('    ---> ', cmd)
	subprocess.run(cmd)

def post(url, **kwargs):
	print('    ---> ', url)
	return requests.post(url, auth=requests.auth.HTTPBasicAuth('admin', 'root'), **kwargs)

os.environ['TERMINUSDB_TAG'] = 'dev'

if not pathlib.Path('terminusdb-bootstrap').exists():
	run(['git', 'clone', 'https://github.com/terminusdb/terminusdb-bootstrap.git'])

run(['terminusdb-bootstrap/terminusdb-container', 'stop'])
run(['terminusdb-bootstrap/terminusdb-container', 'rm'])
run(['terminusdb-bootstrap/terminusdb-container', 'run'])
time.sleep(1)
run(['terminusdb-bootstrap/terminusdb-container', 'cli', 'db', 'create', 'test_db'])

with open('schema.json', 'br') as f:
	r = post(
		'http://localhost:6363/api/document/admin/test_db?author=admin&message=schema&graph_type=schema&full_replace=true',#
		headers={'Content-Type': 'application/json'},
		data=f
	)
print(r, r.text)

with open('data.json', 'br') as f:
	r = post(
		'http://localhost:6363/api/document/admin/test_db?author=admin&message=data&full_replace=true',#
		headers={'Content-Type': 'application/json'},
		data=f,
	)
print(r)
print(r.text)
print(r.json()['api:message'])

@matko matko self-assigned this Sep 5, 2022
@matko
Member

matko commented Sep 9, 2022

I too can reproduce the bug with that script. I'm looking into it.

@matko
Member

matko commented Sep 9, 2022

The bug was triggered by the full_replace flag, which leads to a slightly different code path. In that code path, suspending a document insert until all of its refs are known was not implemented properly, so the insert was attempted while the referenced id was still unbound. That produced the instantiation_error in the stack trace.

This should be fixed in main as soon as #1457 is merged.
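
To illustrate the idea of holding back an insert until every ref is known, here is a minimal Python sketch. It is not TerminusDB's implementation (which is written in Prolog): `insert_with_captures`, `make_id` and the returned list of triples are hypothetical names used only for illustration, and the sketch handles flat documents only.

```python
import itertools


def insert_with_captures(docs, make_id):
    """Assign ids to all documents first, then emit triples once refs resolve.

    Hypothetical sketch: `docs` is a list of flat dicts that may carry
    "@capture" and {"@ref": name} values; `make_id` stands in for the
    key strategy and is an assumption, not a TerminusDB API.
    """
    captures = {}
    ids = []
    # Pass 1: every document gets an id, and captures are recorded by name.
    for doc in docs:
        doc_id = make_id(doc)
        ids.append(doc_id)
        if "@capture" in doc:
            captures[doc["@capture"]] = doc_id

    triples = []
    # Pass 2: refs are looked up only now, so none is left unbound.
    for doc_id, doc in zip(ids, docs):
        for key, value in doc.items():
            if key.startswith("@"):
                continue
            if isinstance(value, dict) and "@ref" in value:
                name = value["@ref"]
                if name not in captures:
                    raise KeyError(f"unresolved @ref: {name}")
                value = captures[name]
            triples.append((doc_id, key, value))
    return triples


counter = itertools.count(1)
docs = [
    {"@type": "Type1", "@capture": "obj-foo", "other": {"@ref": "obj-bar"}},
    {"@type": "Type1", "@capture": "obj-bar", "other": {"@ref": "obj-foo"}},
]
triples = insert_with_captures(docs, lambda doc: f"{doc['@type']}/{next(counter)}")
print(triples)
```

With the mutually referencing documents from the minimal example, both triples point at a concrete id. Writing the first document before the second capture is recorded would leave its `other` value unbound, which is the situation the stack trace shows.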

@matko
Member

matko commented Sep 9, 2022

@GavinMendelGleason the error you originally reported is actually different from the one @mplucinski has been reporting, and I am not sure I can reproduce it. The linked schema does not contain a Paragraph type, and the insert complains about this. That has nothing to do with captures, so I suspect the bug occurred with an earlier version of your schema.
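
A mismatch like this can be spotted locally before inserting. The following is a rough Python sketch, not part of TerminusDB: `undefined_types` is a hypothetical helper, and `schema_stub` is a made-up stand-in rather than the linked concordance schema.

```python
def undefined_types(schema, instances):
    """Return the sorted "@type" names used by instances but absent from the schema."""
    defined = {
        entry["@id"]
        for entry in schema
        if isinstance(entry, dict) and "@id" in entry
    }
    used = set()

    def walk(node):
        # Recurse through nested subdocuments and lists.
        if isinstance(node, dict):
            if "@type" in node:
                used.add(node["@type"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(instances)
    return sorted(used - defined)


# Made-up stand-in schema (an assumption, not the linked file).
schema_stub = [
    {"@type": "Class", "@id": "Term"},
    {"@type": "Class", "@id": "TermDF"},
]
# The documents from the original report.
original_insert = [
    {
        "@type": "Paragraph",
        "@capture": ".paragraph 1 0",
        "text": "Down the Rabbit-Hole",
        "terms": [{"@type": "TermDF", "term": {"@ref": ".term Down"}, "df": 1}],
    },
    {"@type": "Term", "@capture": ".term Down", "tf": 1, "term": "Down"},
]
print(undefined_types(schema_stub, original_insert))
```

The check is deliberately naive: it treats every "@type" value as a class name, so typed literals would be reported as well.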

@GavinMendelGleason
Member Author

Sorry, I should have linked the commit. I'll put up a minimal example.

@matko
Member

matko commented Sep 13, 2022

Reopening for original reason.

@GavinMendelGleason
Member Author

I can no longer reproduce.

See the following schema:

[
  { "@type" : "Class",
    "@id" : "Document",
    "@key" : { "@type" : "Random" },
    "text" : "xsd:string",
    "terms" : { "@type" : "Set",
                "@class" : "TermCount"}},

  { "@type" : "Class",
    "@id" : "Term",
    "@key" : { "@type" : "Lexical",
               "@fields" : ["term"]},
    "term" : "xsd:string",
    "documents" : { "@type" : "Set",
                    "@class" : "Document-TF-IDF" }},

  { "@type" : "Class",
    "@id" : "TermCount",
    "@key" : { "@type" : "Random" },
    "@subdocument" : [],
    "term" : "Term",
    "count" : "xsd:integer" },

  { "@type" : "Class",
    "@id" : "Document-TF-IDF",
    "@key" : { "@type" : "Random" },
    "@subdocument" : [],
    "document" : "Document",
    "tf_idf" : "xsd:decimal" }
]

And the following instance data:

[{
    "@type": "Document",
    "@capture": ".document 1 0",
    "text": "Down the Rabbit-Hole",
    "terms": [
      {
        "@type": "TermCount",
        "term": {
          "@ref": ".term Down"
        },
        "count": 1
      }
    ]
},
{
    "@type": "Term",
    "@capture": ".term Down",
    "term": "Down"
}]

Now try:

terminusdb doc insert admin/terms -g schema < ref_schema.json
terminusdb doc insert admin/terms < ref_instance.json
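
Before running the commands above, it can help to confirm that every `@ref` in the instance data has a matching `@capture`. This is a small Python sketch under that assumption; `collect` and `dangling_refs` are hypothetical helpers, not TerminusDB functions.

```python
def collect(node, key, found):
    """Collect every string stored under `key` anywhere in a nested JSON value."""
    if isinstance(node, dict):
        if isinstance(node.get(key), str):
            found.add(node[key])
        for value in node.values():
            collect(value, key, found)
    elif isinstance(node, list):
        for item in node:
            collect(item, key, found)
    return found


def dangling_refs(documents):
    """Return the sorted "@ref" names that no document captures."""
    captures = collect(documents, "@capture", set())
    refs = collect(documents, "@ref", set())
    return sorted(refs - captures)


# The instance data from this comment.
instance = [
    {
        "@type": "Document",
        "@capture": ".document 1 0",
        "text": "Down the Rabbit-Hole",
        "terms": [
            {"@type": "TermCount", "term": {"@ref": ".term Down"}, "count": 1}
        ],
    },
    {"@type": "Term", "@capture": ".term Down", "term": "Down"},
]
print(dangling_refs(instance))  # prints []
```

An empty list means every ref can be resolved within the same request, so any remaining failure has to come from somewhere other than a missing capture.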
