This repository builds a JAR file that populates HBase tables with enterprises that have one or more legal units (as opposed to the single legal unit handled in sbr-enterprise-assemble). The JAR is uploaded to HDFS and run with a spark2-submit command from an edge node terminal.
- Java 8 or higher
- SBT (Download)
To install SBT quickly you can use Homebrew (Brew):
brew install sbt
Similarly we can get Scala (for development purposes) using brew:
brew install scala
Install HBase locally with brew by using:
brew install hbase
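Before going further it can help to confirm that the tools installed above are on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of this repository.

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on PATH, one per line.
# Prints nothing when all of the tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage: missing_tools java sbt scala hbase
```

An empty result means every prerequisite was found; any name printed still needs to be installed.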
By default, the app runs locally. To run it in cluster mode, pass "cluster" as the 11th parameter.
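Because the mode is selected by position, the first ten parameters must all be supplied before "cluster" has any effect. The sketch below only illustrates that rule in shell; run_mode is a hypothetical helper, and the application's own argument parsing is not shown in this README.

```shell
#!/bin/sh
# Report which mode a given argument list would select, assuming the rule
# stated above: the 11th positional parameter must be exactly "cluster".
run_mode() {
  if [ "${11:-}" = "cluster" ]; then
    echo cluster
  else
    echo local
  fi
}

# With sbt, arguments are passed by quoting the whole run command, e.g.
#   sbt "run <arg1> ... <arg10> cluster"
# where <arg1>..<arg10> are placeholders for parameters not documented here.
```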
HBase can be started locally by:
start-hbase.sh
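One way to confirm that HBase came up is to look for an HMaster process in the output of jps, the JVM process lister that ships with the JDK. This is a sketch under that assumption; has_hmaster is a hypothetical helper name.

```shell
#!/bin/sh
# Read a jps listing on stdin and succeed if it contains an HMaster process.
# Each jps line has the form "<pid> <main class name>".
has_hmaster() {
  grep -q ' HMaster$'
}

# Usage: jps | has_hmaster && echo "HBase is running"
```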
Now that HBase has started, we can open the shell and create the namespace and tables.
hbase shell
create_namespace 'sbr_dev_db'
create 'sbr_dev_db:enterprise', 'd'
create 'sbr_dev_db:links', 'l'
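The same setup can be scripted rather than typed into the interactive shell. The sketch below generates the HBase shell statements for a namespace and a list of table:family pairs; hbase_setup_script is a hypothetical helper, and the usage line assumes the lowercase table names used by the scan commands in this README.

```shell
#!/bin/sh
# Print HBase shell statements that create a namespace and its tables.
# $1 is the namespace; each remaining argument is "table:column_family".
hbase_setup_script() {
  ns=$1
  shift
  printf "create_namespace '%s'\n" "$ns"
  for spec in "$@"; do
    table=${spec%%:*}    # text before the first colon
    family=${spec#*:}    # text after the first colon
    printf "create '%s:%s', '%s'\n" "$ns" "$table" "$family"
  done
}

# Usage (non-interactive HBase shell):
#   hbase_setup_script sbr_dev_db enterprise:d links:l | hbase shell -n
```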
To compile, build and run the application use the following command:
sbt run
The application should now have built, created some HFiles, and used them to populate the HBase tables. The HFiles are written under src/main/resources, in directories named enterprises and links.
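A quick check that both output directories were produced can be sketched as follows. The function name check_hfile_dirs is hypothetical; the default path is the one given above.

```shell
#!/bin/sh
# Verify that the enterprises and links HFile directories exist under a base
# directory (default: src/main/resources). Prints each missing directory and
# returns non-zero if any is absent.
check_hfile_dirs() {
  base=${1:-src/main/resources}
  rc=0
  for name in enterprises links; do
    if [ ! -d "$base/$name" ]; then
      printf 'missing: %s\n' "$base/$name"
      rc=1
    fi
  done
  return "$rc"
}

# Usage, from the repository root after sbt run: check_hfile_dirs
```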
You can then query the HBase tables to see if they populated correctly, using these commands in the HBase shell:
scan "sbr_dev_db:links", {LIMIT => 200}
scan "sbr_dev_db:enterprise", {LIMIT => 200}
Copyright © 2017, Office for National Statistics (https://www.ons.gov.uk)