Doesn't like my config files. #1
@whikloj, this is a semi-known issue that has to do with avoiding classpath problems that arise when pulling in the gargantuan Fedora server classpath. I know the problem and have a fix, which I will get taken care of sometime in the next few days (Tgiving holiday). In the meantime, there is a workaround that @ruebot knows or with which I can help you via IRC in the next day or so. It involves removing all of the |
Ok, thanks. I'll bug @ruebot about it tomorrow. No rush. |
I got yo back!
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
  <bean name="objectStore" class="org.akubraproject.map.IdMappingBlobStore"
        singleton="true">
    <constructor-arg value="urn:example.org:objectStore" />
    <constructor-arg>
      <ref bean="fsObjectStore" />
    </constructor-arg>
    <constructor-arg>
      <ref bean="fsObjectStoreMapper" />
    </constructor-arg>
  </bean>
  <bean name="fsObjectStore" class="org.akubraproject.fs.FSBlobStore"
        singleton="true">
    <constructor-arg value="urn:example.org:fsObjectStore" />
    <constructor-arg value="/md1200/vol1/fedora_data/objectStore"/>
  </bean>
  <bean name="fsObjectStoreMapper"
        class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper"
        singleton="true">
    <constructor-arg value="##" />
  </bean>
  <bean name="datastreamStore" class="org.akubraproject.map.IdMappingBlobStore"
        singleton="true">
    <constructor-arg value="urn:fedora:datastreamStore" />
    <constructor-arg>
      <ref bean="fsDatastreamStore" />
    </constructor-arg>
    <constructor-arg>
      <ref bean="fsDatastreamStoreMapper" />
    </constructor-arg>
  </bean>
  <bean name="fsDatastreamStore" class="org.akubraproject.fs.FSBlobStore"
        singleton="true">
    <constructor-arg value="urn:example.org:fsDatastreamStore" />
    <constructor-arg value="/md1200/vol1/fedora_data/datastreamStore"/>
  </bean>
  <bean name="fsDatastreamStoreMapper"
        class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper"
        singleton="true">
    <constructor-arg value="##" />
  </bean>
</beans> |
...and @whikloj
...and you'll want to use the |
If you are expecting to use this with the trippi-sparql connector, then the simplest thing to do graphname-wise is exactly what @ruebot writes. I need to document what's going on there better. (Short story: the |
@whikloj I have a much simpler workaround to try: please try adding a single attribute |
Cool, I'm just rebuilding some derivatives and then I'll give this a try. |
Ok so the problem in my akubra-llstore.xml still exists, adding I'm trying @ruebot's file example as I think the |
@ruebot's example should certainly work, but it is odd that the |
Wait, I think you are wrong-- it is not failing, because you are getting to here. I think you are fine. You are just seeing warnings, not errors. Are you getting triples? |
@ajs6f TRIPLES!!!! |
I'll see what I can do to hide those annoying and confusing stacktraces. Meanwhile, enjoy your Usan Thanksgiving triples. |
So this is working with @ruebot's modified |
My run finally completed; I will try to start it again using the original akubra-llstore.xml. My quad file contains 121,819,261 lines (or quads), but a count query over my entire Mulgara returns 125,262,308, which leaves 3,443,047 unaccounted for. Is it possible that there are internal triples that would not be persisted with the object on the filesystem? |
It's not obvious that there would be any such triples. My first guess would be that some objects or datastreams weren't readable at the moment that mattered. Can you check the content of the difference by diffing the output of the hot indexer against a complete NQuads dump of Mulgara (you will need to sort them first)? I appreciate that so doing will take a lot of time and computation, but hopefully not too much? I'd like to know what the actual differences are before theorizing. |
I'm not sure I can get NQuads from Mulgara...checking into that. But you are right it took a little but using the |
Okay, to the latter, good, I will update the README to that effect and it will doubtless help others. To the former, you can always dump NTriples out of |
I am currently testing the new "avoid piling up URIs in a list" commits and I will let you know as soon as I am confident in them. |
Yeah, there are commands to do a backup of Mulgara, but they require access to the server. I'm looking in the fcrepo3 code, but I don't know whether a) Mulgara is running separately or b) if it is, whether the server is exposed at all. I wanted to try this, and I got the client library, but I need the host:port to connect to. I tried doing a query and I have a choice of XML and JSON. So it would require exporting it all, then transforming it all into N-Quads, then sorting, then comparing. So this might take some time. |
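For the "transform it all into N-Quads" step, here is a minimal sketch in Python, assuming the export can be obtained as N-Triples and that every triple belongs in one graph. The function name and the graph IRI in the usage note are hypothetical, not something the hot indexer or Mulgara defines. It relies only on the fact that an N-Quads statement is an N-Triples statement with a graph label placed before the final dot.

```python
def ntriples_to_nquads(lines, graph_iri):
    """Yield N-Quads statements for an iterable of N-Triples lines.

    graph_iri is an assumed, caller-supplied graph name; blank lines and
    comment lines are skipped.
    """
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not line.endswith("."):
            raise ValueError(f"not a complete N-Triples statement: {line!r}")
        # Drop only the terminating dot, then append the graph label.
        statement = line[:-1].rstrip()
        yield f"{statement} <{graph_iri}> ."
```

Used on an open file, something like `for quad in ntriples_to_nquads(open("dump.nt"), "urn:example:graph"): ...` streams the output without holding the dump in memory. The graph name has to match whatever the other side of the comparison uses, or every line will differ.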
You should be able to use a query at the |
@ajs6f++ Why is Fedora's Mulgara documentation better than Mulgara's own?! Crazy, this is working. I'll start it now. |
Well, Fedora 3 remained under maintenance for years after Mulgara wasn't, so that's probably got something to do with it. |
Okay, @whikloj , I've committed the new streaming code. Please try it out-- it should get rid of that annoying delay before triples start arriving. Although it won't do anything for your slow storage.... |
Ok it took a bit but I have an N-Quad file of all my triples from Mulgara, then I sorted both files (at 17GB apiece, that took some time and space). I couldn't use For now I will say it is obvious there is stuff in Mulgara that is not in the rdf-extractor output. Simple
versus
What is confusing me is where did Mulgara get |
Did you clean out Mulgara before reindexing into it? |
Actually, looks like you might be okay using embedded Mulgara in particular: https://github.com/fcrepo3/fcrepo/blob/master/fcrepo-server/src/main/java/org/fcrepo/server/resourceIndex/ResourceIndexRebuilder.java#L191 |
Can you verify that the directory containing Mulgara's data was created at the datetime of your last full rebuild? |
Yeah I remember it says that it is cleaning it out and the directory was created on October 28. I thought it was more recent but that is probably correct. |
I'm scanning the objectStore for a file starting with I'm gonna try writing a little Python script to compare the files line by line and create a less memory-intensive (but probably more time-intensive) diff. First I gotta do some weekend stuff; I'll check back later. |
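A line-by-line comparison like that can be sketched as a streaming merge over the two sorted files, which keeps only one line from each file in memory at a time. This is a rough sketch, not the script actually used in this thread; the function names are made up for illustration, and it assumes both files were sorted with the same byte-wise ordering (for example with `LC_ALL=C sort`), since a locale-aware sort order would not agree with the byte comparison below.

```python
def sorted_diff(left_lines, right_lines):
    """Yield ("<", line) for lines only in left and (">", line) for lines only in right.

    Both inputs must be iterables of lines already sorted in the same
    byte-wise order. Lines are matched one-to-one, so an extra duplicate
    on one side is reported as a difference.
    """
    left, right = iter(left_lines), iter(right_lines)
    a, b = next(left, None), next(right, None)
    while a is not None and b is not None:
        if a == b:
            a, b = next(left, None), next(right, None)
        elif a < b:
            yield ("<", a)
            a = next(left, None)
        else:
            yield (">", b)
            b = next(right, None)
    # One side is exhausted; everything left on the other side is unmatched.
    while a is not None:
        yield ("<", a)
        a = next(left, None)
    while b is not None:
        yield (">", b)
        b = next(right, None)


def diff_files(left_path, right_path, out_path):
    """Write the differences between two sorted files to out_path."""
    with open(left_path, "rb") as lf, open(right_path, "rb") as rf, \
            open(out_path, "wb") as out:
        # Strip line endings so a missing final newline does not cause a mismatch.
        left = (line.rstrip(b"\r\n") for line in lf)
        right = (line.rstrip(b"\r\n") for line in rf)
        for side, line in sorted_diff(left, right):
            out.write(side.encode("ascii") + b" " + line + b"\n")
```

Reading the files in binary mode keeps the comparison byte-wise and avoids decoding 17GB of text. The trade-off is the one mentioned above: it uses almost no memory but still has to read both files end to end.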
Well, is it a problem with the rebuilder or the hot indexer? In other words, are those extra triples actually generated from real objects, or not? E.g. is there an alan:testObject2 in the repo? |
Yes, apparently there is. Seemingly we have some very old vendor test objects in our repository. So I should remove them, but that doesn't explain why the hot indexer didn't seem to find them.
…On 26 Nov 2016 11:20 a.m., "A. Soroka" ***@***.***> wrote:
Well, is it a problem with the rebuilder or the hot indexer? In other
words, are those extra triples actually generated from real objects, or
not? E.g. is there a alan:testObject2 in the repo?
|
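To check whether an object such as alan:testObject2 really has a file in the store, a scan of the directory tree can be sketched as below. This assumes, without confirmation from this thread, that the file names on disk are percent-encoded forms of the object identifier; decoding each name before matching avoids guessing the exact encoding. The function name is hypothetical.

```python
import os
from urllib.parse import unquote


def find_blobs_for_pid(store_root, pid):
    """Yield paths under store_root whose percent-decoded file name contains pid.

    Assumption: blob file names are percent-encoded identifiers. If they are
    stored some other way, the match needs to change accordingly.
    """
    for dirpath, _dirnames, filenames in os.walk(store_root):
        for name in filenames:
            if pid in unquote(name):
                yield os.path.join(dirpath, name)
```

With the paths from the config earlier in this thread, `list(find_blobs_for_pid("/md1200/vol1/fedora_data/objectStore", "alan:testObject2"))` would list any matching files. On slow storage a full walk of the tree can take a long time.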
No, you are right about that. I know it must be a large file, but can I get access to the log of your hot indexer run somewhere? |
Actually, @whikloj , can you close this ticket (because we got the prob with your conf file resolved, at least to first order) and open a new one specifically about the missed objects? |
Absolutely |
Tried to use this but am having trouble with my Akubra storage config file.
My akubra-llstore.xml -> https://gist.github.com/whikloj/584dea271c6e872e4b3d574676781bcc