This repository has been archived by the owner on May 24, 2022. It is now read-only.

finc Solr Schema Update in D:SWARM

Bo Ferri edited this page Dec 5, 2016 · 9 revisions

note: the following guide/script could be simplified/generalized if necessary (feel free to drop a line ;) ). Currently, it describes how to update a concrete, already existing inbuilt schema in d:swarm from a concrete, given Solr schema (the finc Solr schema).

Update D:SWARM finc Solr schema

  1. copy the new finc Solr schema into (see here)

     [D:SWARM backend repository home dir]/converter/src/test/resources/finc-solr-schema.xml
    
    1. note: you may want to keep track of fields that were static before and are now dynamic (e.g. 'format_*'), because d:swarm only parses static fields (it doesn't know all possible values of a dynamic field)
  2. run the unit test at (see here)

     org.dswarm.converter.schema.test.SolrSchemaParserTest#testFincSolrSchemaCreation()
    
  3. this test will fail if the schema has changed (in a diff view, you should also be able to see the differences between the attribute paths of the old schema and the new schema), i.e.
    1. copy the content from the actual processing result to (see here)

       [D:SWARM backend repository home dir]/converter/src/test/resources/finc-solr-schema_-_attribute_paths.txt
    
    2. run the test again (now it should succeed)
  4. re-create the builtin schemata
    1. enable the test at (see here)

       org.dswarm.converter.schema.test.BuildInitInternalSchemaScriptTest#buildScript()
    
    2. run this test in debug mode (!) and set a breakpoint at the last line of this test method, i.e. CmdUtil.runCommand(sb.toString(), output); (see here), because the file writing in this test seems to be broken right now
      1. note: test execution may take a while (so don't worry)
      2. when the test stops at the described breakpoint, stop the test (this will save the state in the metadata repository)
      3. dump the metadata repository state into the init metadata repository script with

         mysqldump -u[metadata repository user] -p[metadata repository user password] --no-create-info --no-create-db --skip-triggers --skip-create-options --skip-add-drop-table --skip-lock-tables --skip-add-locks -B [metadata repository database] > [D:SWARM backend repository home dir]/src/main/resources/init_internal_schema.sql
    
  5. commit the changes; I would recommend doing it as follows:
    1. commit the changed source files ([D:SWARM backend repository home dir]/converter/src/test/resources/finc-solr-schema.xml + [D:SWARM backend repository home dir]/converter/src/test/resources/finc-solr-schema_-_attribute_paths.txt) in a separate commit
    2. disable the test at org.dswarm.converter.schema.test.BuildInitInternalSchemaScriptTest#buildScript() (+ clean up the imports; alternatively, you can simply revert the changes made in this file, i.e. no (unnecessary) changes in org.dswarm.converter.schema.test.BuildInitInternalSchemaScriptTest should be committed)
    3. commit the updated init metadata repository script ([D:SWARM backend repository home dir]/src/main/resources/init_internal_schema.sql) in a second commit
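
note: to keep track of fields that moved from static to dynamic (see the note on the first step), a small helper script can compare the old and the new Solr schema. The following is only a minimal sketch and not part of d:swarm; the function names and the sample field 'format_de15' are made up for illustration. It assumes that static fields are declared as field elements and dynamic fields as dynamicField elements with a glob-like name such as 'format_*'.

```python
import fnmatch
import xml.etree.ElementTree as ET


def field_names(schema_xml):
    """Return (static field names, dynamic field patterns) of a Solr schema."""
    root = ET.fromstring(schema_xml)
    static = {f.get("name") for f in root.iter("field") if f.get("name")}
    dynamic = {f.get("name") for f in root.iter("dynamicField") if f.get("name")}
    return static, dynamic


def became_dynamic(old_schema_xml, new_schema_xml):
    """Map fields that were static in the old schema to the dynamic
    pattern that covers them in the new schema."""
    old_static, _ = field_names(old_schema_xml)
    new_static, new_dynamic = field_names(new_schema_xml)
    moved = {}
    for name in sorted(old_static - new_static):
        for pattern in sorted(new_dynamic):
            # dynamic field names are treated as glob patterns, e.g. "format_*"
            if fnmatch.fnmatchcase(name, pattern):
                moved[name] = pattern
                break
    return moved


if __name__ == "__main__":
    # made-up miniature schemata; "format_de15" is a hypothetical field name
    old = ('<schema><field name="id" type="string"/>'
           '<field name="format_de15" type="string"/></schema>')
    new = ('<schema><field name="id" type="string"/>'
           '<dynamicField name="format_*" type="string"/></schema>')
    print(became_dynamic(old, new))
```

To use it on real files, pass the content of the previous and the new finc-solr-schema.xml (e.g. read in binary mode) to became_dynamic(). Fields reported there will no longer show up in the parsed d:swarm schema.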

Update Solr XML DataSource Config for finc Solr Schema

  1. run the unit test at (see here)

org.dswarm.converter.schema.test.SolrXMLDataSourceConfigGeneratorTest#testSolrXMLDataSourceConfigGenerator()

    1. this test will fail if the schema has changed (in a diff view, you should also be able to see the differences), i.e.
      1. copy the content from the actual processing result to (see here)

         [D:SWARM backend repository home dir]/converter/src/test/resources/expected-data-config.xml
    
        1. to do so, you need to run the test in debug mode (!) and set a breakpoint at line 85 (Assert.assertNotNull(actualDataConfig);)
        2. when the test stops at the described breakpoint, copy the content of the actualDataConfig variable into the file described above (for readability, it's recommended to re-format this file afterwards, e.g. with the help of your IDE or editor) and stop this test run afterwards
      2. run the test again (not in debug mode this time; now it should succeed)
  2. repeat the steps from 1. for the following test as well (see here)

org.dswarm.converter.schema.test.SolrXMLDataSourceConfigGeneratorTest#testSolrXMLDataSourceConfigGenerator2()

    1. note: the file for the expected result is located at (see here)

       [D:SWARM backend repository home dir]/converter/src/test/resources/expected-data-config2.xml
    
  3. commit both (changed) expected test result files
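
note: if you don't want to re-format the copied data config in your IDE or editor, the re-formatting can also be scripted. The following is only a minimal sketch using the Python standard library; the function name pretty_xml and the sample content are made up for illustration. It assumes that the copied content is well-formed XML and that whitespace inside the file is not significant.

```python
import xml.dom.minidom


def pretty_xml(raw_xml, indent="  "):
    """Re-indent an XML string and drop whitespace-only lines."""
    dom = xml.dom.minidom.parseString(raw_xml)
    pretty = dom.toprettyxml(indent=indent)
    # toprettyxml emits extra blank lines for whitespace text nodes; remove them
    lines = [line for line in pretty.splitlines() if line.strip()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # made-up one-line sample, standing in for the copied actualDataConfig value
    raw = '<dataConfig><document><entity name="example"/></document></dataConfig>'
    print(pretty_xml(raw), end="")
```

Run the test again after re-formatting, to make sure that the comparison with the expected file still succeeds.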

note: the Solr XML DataSource Config file that should be used in your finc Solr configuration is (see here)

[D:SWARM backend repository home dir]/converter/src/test/resources/expected-data-config2.xml