This changeset makes it possible to parse huge Turtle, TriG and N3 files, where "huge" means files larger than main memory. It has been tested on the GND dataset of the Deutsche Nationalbibliothek (121 MTriples), the DBpedia dataset (583 MTriples) and a private production dataset (1225 MTriples).

Previously, the turtle parser accumulated all input in one huge buffer and then processed it in a single pass. This changeset introduces a parser that attempts to parse each chunk as soon as it is given. A syntax error caused by the buffer ending in the middle of a grammar rule is handled by resolving the statement through the special `error' rule. The accompanying error recovery copies the remainder of the buffer to its beginning so that the next chunk can be appended to it. Full turtle statements (the ones ending in DOT) will never be part of the remainder.

However, because of blank nodes and collections, statements can no longer be issued immediately. Instead, the emission of a statement is deferred. This avoids dangling (bnodeid) statements when a blank node property list or collection has already been read but the enclosing turtle SPO statement has not yet been terminated by DOT.

* struct raptor_turtle_parser_s: introduce slots for buffer bookkeeping.

* turtle_lexer.l: use YY_USER_ACTION to keep track of buffer consumption.

* turtle_parser.y
  raptor_turtle_generate_statement(): split in two, see the following.
  raptor_turtle_clone_statement(): prepare a statement for handling.
  raptor_turtle_handle_statement(): call the parser's statement handler.
  raptor_turtle_defer_statement(): like raptor_turtle_generate_statement(), but instead of calling the statement handler immediately, put the statement on a list of deferred statements; these are handled only if the statement rule path (triples DOT) has been taken.
  raptor_turtle_parse_chunk(): begin parsing on every call; only accumulate input in buffers if the remainder of a chunk has been resolved through the `error' rule.
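The remainder handling described above can be illustrated with a small, self-contained sketch. This is not the code from turtle_parser.y: the type chunk_parser and the functions consume_statements() and parse_chunk() are hypothetical names, and the real bison grammar is replaced by a stand-in that treats every '.' as the end of a statement (so it would miscount dots inside literals or IRIs). It only shows the bookkeeping idea: handle what is complete, then move the unfinished tail to the start of the buffer so the next chunk can be appended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping, loosely modelled on the idea of the buffer
 * slots added to struct raptor_turtle_parser_s (names are made up). */
typedef struct {
  char *buf;       /* unconsumed remainder followed by the current chunk */
  size_t len;      /* bytes currently held in buf */
  size_t cap;      /* allocated size of buf */
  size_t emitted;  /* number of complete statements handled so far */
} chunk_parser;

/* Stand-in for the real grammar: a statement is everything up to '.'.
 * Returns the number of bytes that belong to complete statements. */
static size_t consume_statements(chunk_parser *p)
{
  size_t consumed = 0;
  for (size_t i = 0; i < p->len; i++) {
    if (p->buf[i] == '.') {
      p->emitted++;        /* a real parser would call its handler here */
      consumed = i + 1;
    }
  }
  return consumed;
}

/* Parse one chunk immediately and keep only the unfinished tail.
 * Returns 0 on success, nonzero on allocation failure. */
static int parse_chunk(chunk_parser *p, const char *chunk, size_t n)
{
  if (n == 0)
    return 0;

  /* Grow the buffer so the chunk fits behind the old remainder. */
  if (p->len + n > p->cap) {
    size_t cap = p->len + n;
    char *nb = realloc(p->buf, cap);
    if (!nb)
      return 1;
    p->buf = nb;
    p->cap = cap;
  }
  memcpy(p->buf + p->len, chunk, n);
  p->len += n;

  /* Handle complete statements, then copy the remainder to the front. */
  size_t consumed = consume_statements(p);
  assert(consumed <= p->len);
  memmove(p->buf, p->buf + consumed, p->len - consumed);
  p->len -= consumed;
  return 0;
}
```

With a zero-initialised chunk_parser, feeding the two chunks "<a> <b> <c> . <d> <e" and "> <f> ." handles one statement after the first call and leaves " <d> <e" as the remainder; the second call completes that statement and leaves the buffer empty. Memory use is therefore bounded by the longest unfinished statement plus one chunk, not by the file size.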
struct raptor_turtle_context: gains fields mkr_rs_size / arity / ntuple / nvalue / processing_value.
(raptor_mkr_emit_subject_resultset): Switch statics to the context fields above.