Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TupleIterator from Java_karmaresearch_vlog_VLog_query is not proper deleted #94

Closed
andreschamschurko opened this issue Jan 24, 2022 · 4 comments

Comments

@andreschamschurko
Copy link

Hi,

when I call the method VLogReasoner.getInferences() from Rulewerk in Java multiple times, then the needed memory of the process increases continuously.
This happens independently of the used knowledge base as long as the knowledge base is not empty and the reasoner has to reason.

public void letItLeak() throws IOException, ParsingException
{
    final String sources = "@source s[3] : load-csv(\"s.csv\") .";

    KnowledgeBase knowledgeBase = new KnowledgeBase();
    RuleParser.parseInto(knowledgeBase, sources.toString());

    for (int i=0; i < 10000000; i++)
    {
        Reasoner reasoner = new VLogReasoner(knowledgeBase);
        reasoner.reason();

        reasoner.getInferences();

        reasoner.close();
    }
}

This happens even when the loop consists just of the call of the method getInferences(). And the rest of the calls can be done outside of the loop.

s.csv:

"alpha","beta","gamma"

The example from the top uses after a minute almost 20 GB of RAM.

Searching for a Java memory leak, I profiled the Java heap with the JProfiler.
But the indicated memory in JProfiler differed from the number of my taskmanager.
This difference increased with every second. After 40 s the JProfiler shows 4 GB of reserved memory, but my taskmanager told me the process used ~15 GB.
After starting the GarbageCollector manually, most of the 4 GB from the JProfiler were freed, which the taskmanager confirmed.

After that the used memory, shown in the taskmanager, was ~11 GB and still increasing.
An analysis with the run option "-verbose:gc" confirmed the numbers of the JProfiler.

From this I concluded that there is a memory leak outside of Java and tried to analyse the whole used memory of the method.

I analysed the method with the tool Heaptrack, which indicated a memory leak in libvlog-core.so.
The size of this memory leak was 16GB, which equals the difference between the indicated memory of JProfiler and my taskmanager.
After that I print debugged the method Java_karmaresearch_vlog_QueryResultIterator_cleanup in src/vlog/native/Vlog.cpp, the destructors of the TupleIterator and the TupleTableItr:

...
++ TupleTableItr: ctor 0x7f0c306e0660
-- JAVA QueryResultIterator.close() 0x7f0c306e0660
-- C++ cleanup called: 0x7f0c306e0660
-- TupleTableItr: dtor 0x7f0c306e0660
BASIS dtor TupleIterator 0x7f0c306e0660
++ TupleTableItr: ctor 0x7f0c306dd280
-- JAVA QueryResultIterator.close() 0x7f0c306dd280
-- C++ cleanup called: 0x7f0c306dd280
-- TupleTableItr: dtor 0x7f0c306dd280
BASIS dtor TupleIterator 0x7f0c306dd280
++ TupleTableItr: ctor 0x7f0c306e0660
-- JAVA QueryResultIterator.close() 0x7f0c306e0660
-- C++ cleanup called: 0x7f0c306e0660
-- TupleTableItr: dtor 0x7f0c306e0660
BASIS dtor TupleIterator 0x7f0c306e0660
...

This confirms that every created Iterator will be destroyed.

The suspicion was, that the memory leak is introduced by the TupleTableItr class.
To double check we replaced it by a dummy implementation DummyItr and use it in the function Java_karmaresearch_vlog_VLog_query.

    class DummyItr : public TupleIterator
    {
		char data_static[100000];
		char* data_dynamic;

		public:
			DummyItr()
			{
				std::cout << "++ DummyItr: ctor " << std::hex << (void*) this << std::endl;
				data_dynamic = new char[100000];
			}

			~DummyItr()
			{
				std::cout << "-- DummyItr: dtor " << std::hex << (void*) this << std::endl;
				delete [] data_dynamic;
			}

			bool hasNext() { return false; }
			void next() {}
			size_t getTupleSize() { return 0; }
			uint64_t getElementAt(const int pos) { return 0; }
	};

	/*
	 * Class:     karmaresearch_vlog_VLog
	 * Method:    query
	 * Signature: (I[JZZ)Lkarmaresearch/vlog/QueryResultIterator;
	 */
	JNIEXPORT jobject JNICALL Java_karmaresearch_vlog_VLog_query(JNIEnv * env, jobject obj, jint p, jlongArray els, jboolean includeConstants, jboolean filterBlanks) {
		if (p == -1) {
			throwNonExistingPredicateException(env, "Query contains non-existing predicate");
			return NULL;
		}

		// TupleIterator *iter = getQueryIter(env, obj, (PredId_t) p, els, includeConstants);
		TupleIterator *iter = new DummyItr();
		if (iter == NULL) {
			return NULL;
		}
		jclass jcls=env->FindClass("karmaresearch/vlog/QueryResultIterator");
		jmethodID mID = env->GetMethodID(jcls, "<init>", "(JZ)V");
		jobject jobj = env->NewObject(jcls, mID, (jlong) iter, filterBlanks);

		return jobj;
	}

In this case we do not observe any memory leak. The memory usage remains constant during the whole execution time.

This substantiates the earlier suspicion.

@CerielJacobs
Copy link
Contributor

Hi,
does getEDBIterator get called? I think I found a leak there, and I don't see any rules.

@andreschamschurko
Copy link
Author

Sorry, I use the following rule:

t1 (?c,?b,?a) :- s(?a,?b,?c) .

Yes, getEDBIterator() gets called:

++ C++ getIteratorWithMaterialization() called
++ TupleTableItr: ctor 0x7f00abffffb0
-- JAVA QueryResultIterator.close() 0x7f00abffffb0
-- C++ cleanup called: 0x7f00abffffb0
-- TupleTableItr: dtor 0x7f00abffffb0
BASIS dtor TupleIterator 0x7f00abffffb0
++ C++ getEDBIterator() called
++ TupleTableItr: ctor 0x7f03b0ce5ad0
-- JAVA QueryResultIterator.close() 0x7f03b0ce5ad0
-- C++ cleanup called: 0x7f03b0ce5ad0
-- TupleTableItr: dtor 0x7f03b0ce5ad0
BASIS dtor TupleIterator 0x7f03b0ce5ad0
++ C++ getIteratorWithMaterialization() called
++ TupleTableItr: ctor 0x7eff6bffffb0
-- JAVA QueryResultIterator.close() 0x7eff6bffffb0
-- C++ cleanup called: 0x7eff6bffffb0
-- TupleTableItr: dtor 0x7eff6bffffb0
BASIS dtor TupleIterator 0x7eff6bffffb0
++ C++ getEDBIterator() called
++ TupleTableItr: ctor 0x7f03b0ce5ad0
-- JAVA QueryResultIterator.close() 0x7f03b0ce5ad0
-- C++ cleanup called: 0x7f03b0ce5ad0
-- TupleTableItr: dtor 0x7f03b0ce5ad0
BASIS dtor TupleIterator 0x7f03b0ce5ad0

Your commit reduced the memory leak after the first minute to 4 GB.

@CerielJacobs
Copy link
Contributor

Hi,
I found another leak. I think it is OK now. Can you please try?

@andreschamschurko
Copy link
Author

Hi,
yes, your last commit fixed the rest of the leak, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants