ONSdigital/sbr-idbr-data-load

sbr-idbr-data-load

What is this repository?

This repository builds a JAR file that populates HBase tables with enterprises that have one or more legal units (as opposed to the single legal unit handled in sbr-enterprise-assemble). The JAR is uploaded to HDFS and run with a spark2-submit command from an edge node terminal.
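As a rough sketch of that workflow, the script below assembles the upload and submit commands and prints them instead of running them. The JAR file name, HDFS directory, and main class are placeholders invented for illustration; the real values are not stated here. Whether spark2-submit can read the JAR directly from an hdfs:// path depends on the Spark version and deploy mode, so treat the path form as an assumption too.

```shell
#!/bin/sh
# All three values are hypothetical placeholders, not taken from this repository.
JAR_NAME="sbr-idbr-data-load-assembly.jar"
HDFS_DIR="/user/example/lib"
MAIN_CLASS="example.Main"

# Print the command that copies the locally built JAR into HDFS
# (-f overwrites an existing copy).
upload_cmd() {
  printf 'hdfs dfs -put -f %s %s/\n' "$JAR_NAME" "$HDFS_DIR"
}

# Print the spark2-submit command; any arguments given to this function
# are appended as application arguments.
submit_cmd() {
  printf 'spark2-submit --class %s hdfs://%s/%s' \
    "$MAIN_CLASS" "$HDFS_DIR" "$JAR_NAME"
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
}

upload_cmd
submit_cmd
```

Printing the commands first makes it easy to review them before pasting them into the edge node terminal.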

Prerequisites

Development Setup (macOS)

To install SBT quickly you can use Homebrew (Brew):

brew install sbt

Similarly we can get Scala (for development purposes) using brew:

brew install scala

Install HBase locally with brew by using:

brew install hbase

Running the App

By default, the app runs locally. To run it in cluster mode, pass "cluster" as the 11th parameter.
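The positional convention can be illustrated with a small shell function. This is only a sketch of the rule described above, not the application's own argument handling; the first ten arguments are stand-ins because their meaning is not documented here, and treating every value other than "cluster" as local mode is an assumption.

```shell
#!/bin/sh
# Report the mode implied by the 11th positional argument.
# Assumption: anything other than the literal string "cluster"
# (including a missing argument) means local mode.
run_mode() {
  # Positional parameters above 9 need braces: ${11}, not $11.
  if [ "${11:-}" = "cluster" ]; then
    echo "cluster"
  else
    echo "local"
  fi
}

run_mode p1 p2 p3 p4 p5 p6 p7 p8 p9 p10          # prints "local"
run_mode p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 cluster  # prints "cluster"
```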

HBase

Start HBase locally with:

start-hbase.sh

Now that HBase has started, we can open the shell and create the namespace and tables.

hbase shell
create_namespace 'sbr_dev_db'
create 'sbr_dev_db:enterprise', 'd'
create 'sbr_dev_db:links', 'l'

To compile, build and run the application use the following command:

sbt run

If the run succeeds, the application will have built, created some HFiles, and used them to populate the HBase tables. The HFiles are written to src/main/resources, in directories named enterprises and links.
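A quick way to confirm that step is to check for the two directories. The helper below is a minimal sketch; the function name check_hfiles is made up for this example, and it only tests that the directories exist, not that their contents are valid.

```shell
#!/bin/sh
# Return 0 if both expected output directories exist under the given base
# directory (default: src/main/resources); otherwise report what is missing
# and return 1.
check_hfiles() {
  base="${1:-src/main/resources}"
  status=0
  for name in enterprises links; do
    if [ -d "$base/$name" ]; then
      echo "found: $base/$name"
    else
      echo "missing: $base/$name"
      status=1
    fi
  done
  return "$status"
}
```

Run check_hfiles from the project root after sbt run finishes; a non-zero exit status means at least one directory was not created.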

You can then query the HBase tables to see if they populated correctly, using these commands in the HBase shell:

scan "sbr_dev_db:links", {LIMIT => 200}
scan "sbr_dev_db:enterprise", {LIMIT => 200}
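The same check can be scripted instead of typed into an interactive shell. The sketch below only generates the scan commands; the function name scan_commands and the default limit of 5 rows are choices made for this example.

```shell
#!/bin/sh
# Print one HBase shell scan command per table, using the table names from
# the commands above. The row limit defaults to 5.
scan_commands() {
  limit="${1:-5}"
  for table in links enterprise; do
    printf "scan 'sbr_dev_db:%s', {LIMIT => %s}\n" "$table" "$limit"
  done
}

scan_commands 10
```

Assuming the installed HBase version supports the shell's non-interactive flag, the output can be piped straight in with `scan_commands 10 | hbase shell -n`.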

License

Copyright © 2017, Office for National Statistics (https://www.ons.gov.uk)

About

Loads extract files from IDBR into SBR
