WIP: API that can filter and transform data before forwarding on to Senzing.
jamesiarmes committed Mar 26, 2024
1 parent 9157a89 commit 26c2f4d
Showing 13 changed files with 171 additions and 19 deletions.
56 changes: 40 additions & 16 deletions Gemfile.lock
@@ -4,59 +4,81 @@ PATH
cmr-entity-resolution (0.1.0)
faraday (~> 2.7)
file_exists (~> 0.2)
ibm_db (~> 5.4)
grape (~> 2.0)
iteraptor (~> 0.10)
mongo (~> 2.18)
rack (~> 3.0)
rackup (~> 2.1)
sequel (~> 5.68)
thor (~> 1.2)
yajl-ruby (~> 1.4)

GEM
remote: https://rubygems.org/
specs:
activemodel (7.0.7.2)
activesupport (= 7.0.7.2)
activerecord (7.0.7.2)
activemodel (= 7.0.7.2)
activesupport (= 7.0.7.2)
activesupport (7.0.7.2)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (>= 1.6, < 2)
minitest (>= 5.1)
tzinfo (~> 2.0)
addressable (2.8.4)
public_suffix (>= 2.0.2, < 6.0)
ast (2.4.2)
bigdecimal (3.1.7)
bson (4.15.0)
builder (3.2.4)
concurrent-ruby (1.2.2)
diff-lcs (1.5.0)
docile (1.4.0)
down (5.4.1)
addressable (~> 2.8)
dry-core (1.0.1)
concurrent-ruby (~> 1.0)
zeitwerk (~> 2.6)
dry-inflector (1.0.0)
dry-logic (1.5.0)
concurrent-ruby (~> 1.0)
dry-core (~> 1.0, < 2)
zeitwerk (~> 2.6)
dry-types (1.7.2)
bigdecimal (~> 3.0)
concurrent-ruby (~> 1.0)
dry-core (~> 1.0)
dry-inflector (~> 1.0)
dry-logic (~> 1.4)
zeitwerk (~> 2.6)
factory_bot (6.2.1)
activesupport (>= 5.0.0)
faraday (2.7.6)
faraday-net_http (>= 2.0, < 3.1)
ruby2_keywords (>= 0.0.4)
faraday-net_http (3.0.2)
file_exists (0.2.0)
grape (2.0.0)
activesupport (>= 5)
builder
dry-types (>= 1.1)
mustermann-grape (~> 1.0.0)
rack (>= 1.3.0)
rack-accept
i18n (1.14.1)
concurrent-ruby (~> 1.0)
ibm_db (5.4.1)
activerecord (< 7.1)
down
zip
iteraptor (0.10.0)
json (2.6.3)
minitest (5.19.0)
mongo (2.18.2)
bson (>= 4.14.1, < 5.0.0)
mustermann (3.0.0)
ruby2_keywords (~> 0.0.1)
mustermann-grape (1.0.2)
mustermann (>= 1.0.0)
parallel (1.23.0)
parser (3.2.2.3)
ast (~> 2.4.1)
racc
public_suffix (5.0.1)
racc (1.7.0)
rack (3.0.10)
rack-accept (0.4.5)
rack (>= 0.4)
rackup (2.1.0)
rack (>= 3)
webrick (~> 1.8)
rainbow (3.1.1)
rake (13.0.6)
regexp_parser (2.8.0)
@@ -111,10 +133,12 @@ GEM
tzinfo (2.0.6)
concurrent-ruby (~> 1.0)
unicode-display_width (2.4.2)
webrick (1.8.1)
yajl-ruby (1.4.3)
zip (2.0.2)
zeitwerk (2.6.13)

PLATFORMS
arm64-darwin-23
x86_64-darwin-20
x86_64-darwin-22
x86_64-linux
3 changes: 3 additions & 0 deletions cmr-entity-resolution.gemspec
@@ -29,9 +29,12 @@ Gem::Specification.new do |s|
# Add runtime dependencies.
s.add_runtime_dependency 'faraday', '~> 2.7'
s.add_runtime_dependency 'ibm_db', '~> 5.4'
s.add_runtime_dependency 'grape', '~> 2.0'

RuboCop notice (cmr-entity-resolution.gemspec#L32): Dependencies should be sorted in an alphabetical order within their section of the gemspec. Dependency `grape` should appear before `ibm_db`. [Gemspec/OrderedDependencies]
s.add_runtime_dependency 'iteraptor', '~> 0.10'
s.add_runtime_dependency 'mongo', '~> 2.18'
s.add_runtime_dependency 'sequel', '~> 5.68'
s.add_runtime_dependency 'rack', '~> 3.0'

RuboCop notice (cmr-entity-resolution.gemspec#L36): Dependencies should be sorted in an alphabetical order within their section of the gemspec. Dependency `rack` should appear before `sequel`. [Gemspec/OrderedDependencies]
s.add_runtime_dependency 'rackup', '~> 2.1'
s.add_runtime_dependency 'thor', '~> 1.2'
s.add_runtime_dependency 'yajl-ruby', '~> 1.4'

5 changes: 5 additions & 0 deletions config.ru
@@ -0,0 +1,5 @@
require_relative 'lib/api'

RuboCop notice (config.ru#L1): Missing frozen string literal comment. [Style/FrozenStringLiteralComment]

use Rack::RewindableInput::Middleware

run API
22 changes: 22 additions & 0 deletions docker-compose.yml
@@ -117,6 +117,7 @@ services:

tools:
build: .
platform: linux/amd64
environment:
SENZING_ENGINE_CONFIGURATION_JSON: >-
{
@@ -141,6 +142,7 @@ services:
depends_on:
- api
build: .
platform: linux/amd64
environment:
SENZING_ENGINE_CONFIGURATION_JSON: >-
{
@@ -168,6 +170,7 @@ services:
depends_on:
- api
build: .
platform: linux/amd64
environment:
SENZING_ENGINE_CONFIGURATION_JSON: >-
{
@@ -189,6 +192,23 @@ services:
- ${EXPORTER_CONFIG_FILE:-./config/config.yml}:/etc/cmr/config.yml
- ./data/export:/etc/cmr/export

cmr-api:
profiles:
- cmr-api
depends_on:
- api
build: .
platform: linux/amd64
environment:
CMR_CONFIG_FILE: /etc/cmr/config.yml
networks:
- senzing
ports:
- ${CMR_API_PORT:-3000}:3000
command: api
volumes:
- ${CMR_CONFIG_FILE:-./config/config.yml}:/etc/cmr/config.yml

webapp-console:
profiles:
- webapp
@@ -208,6 +228,7 @@ services:
}
}
image: senzing/entity-search-web-app-console:${SENZING_DOCKER_IMAGE_VERSION_ENTITY_SEARCH_WEB_APP_CONSOLE:-latest}
platform: linux/amd64
networks:
- senzing
user: "${SENZING_UID:-1001}:${SENZING_GID:-1001}"
@@ -231,6 +252,7 @@ services:
SENZING_WEB_SERVER_PORT: 8251
SENZING_WEB_SERVER_STREAM_CLIENT_URL: wss://api:8250/ws
image: senzing/entity-search-web-app:${SENZING_DOCKER_IMAGE_VERSION_ENTITY_SEARCH_WEB_APP:-latest}
platform: linux/amd64
networks:
- senzing
ports:
46 changes: 46 additions & 0 deletions lib/api.rb
@@ -0,0 +1,46 @@
# frozen_string_literal: true

require 'grape'

require_relative 'config'
require_relative 'import'

# Provides a proxy api for entity resolution.
class API < Grape::API
format :json

resource :health do
get do
{ status: 'ok' }
end
end

resource :import do
desc 'Import a single record into Senzing.'
params do
requires :source, type: Symbol, desc: 'The name of the data source.'
end
post do
# TODO: Load and modify the config once at startup.
config = Config.from_file(ENV.fetch('CMR_CONFIG_FILE', 'config/config.yml'))

# TODO: Return a 404 if the source is not found.
source = config.sources[params[:source]]
if source.nil?
status 404
return "Source \"#{params[:source]}\" not found."
end

# Override configured sources as we only want to use this one.
source[:type] = 'API::JSON'
source[:payload] = params
config.sources = { params[:source] => source }

# TODO: Pass the source name here as an optional argument.
Import.new(config).import

# TODO: Return something better. Make it JSONy.
'Record imported.'
end
end
end
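
For reviewers who want to exercise the new endpoint, the following is a minimal client-side sketch using only the standard library. It builds (but does not send) the request. The `/import` path and the required `source` parameter come from lib/api.rb above, and port 3000 from the entrypoint script; the base URL, the `people` source name, the record fields, and the `build_import_request` helper are all hypothetical.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build a POST request for the import endpoint without sending it.
# base_url, source, and the record fields are illustrative assumptions.
def build_import_request(base_url, source, record)
  uri = URI.join(base_url, '/import')
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  # The endpoint stores the whole params hash as the payload, so the record
  # fields travel in the same JSON object as `source`.
  request.body = JSON.generate(record.merge(source: source))
  request
end

request = build_import_request(
  'http://localhost:3000',
  'people',
  { first_name: 'Ada', last_name: 'Lovelace' }
)

puts request.path
puts request.body
```

To actually send it, the request could be passed to `Net::HTTP.start('localhost', 3000) { |http| http.request(request) }` once the service is running.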
3 changes: 2 additions & 1 deletion lib/import.rb
@@ -43,7 +43,8 @@ def senzing
# @yield
# @yieldparam source [Source::Base] A source object for data imports.
def each_source
- @sources ||= @config.sources.each do |source|
+ @sources ||= @config.sources.each do |name, source|
+ source[:name] ||= name
yield Source.from_config(source)
end
end
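
The change to `each_source` appears to rely on `@config.sources` being a hash keyed by source name. A small sketch of the difference between one and two block parameters when iterating a hash, using a hypothetical `sources` hash in place of the real configuration:

```ruby
# Hypothetical stand-in for @config.sources: a hash keyed by source name.
sources = {
  people: { type: 'csv' },
  cases: { type: 'informix', name: 'court_cases' }
}

# With one block parameter, each entry arrives as a [name, source] array.
single_param = sources.map { |entry| entry.class }

# With two block parameters the pair is destructured, so the key can be
# used as a fallback name without overwriting an explicit one.
named = sources.map do |name, source|
  source[:name] ||= name
  source
end

p single_param
p named.map { |source| source[:name] }
```

Because `||=` only assigns when the value is nil or false, a source that already sets `:name` keeps it.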
1 change: 1 addition & 0 deletions lib/source.rb
@@ -1,5 +1,6 @@
# frozen_string_literal: true

require_relative 'source/api'
require_relative 'source/csv'
require_relative 'source/informix'

1 change: 1 addition & 0 deletions lib/source/api.rb
@@ -0,0 +1 @@
require_relative 'api/json'

RuboCop notice (lib/source/api.rb#L1): Missing frozen string literal comment. [Style/FrozenStringLiteralComment]
24 changes: 24 additions & 0 deletions lib/source/api/base.rb
@@ -0,0 +1,24 @@
# frozen_string_literal: true

require_relative '../base'

module Source
module API
# Base class for API data sources.
class Base < Source::Base
def each
records = @source_config[:payload].is_a?(Array) ? @source_config[:payload] : [@source_config[:payload]]
records.each do |row|
row.transform_keys! { |key| field_mapper(key) }
yield row
end
end

private

def default_name
"API::#{super}"
end
end
end
end
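
The `each` method above normalizes the payload to a list of records and then renames keys. A standalone sketch of that pattern follows. `FIELD_MAP`, `map_field`, and `normalize_records` are hypothetical names, and the fallback-to-original-key behaviour of `map_field` is an assumption about what `field_mapper` does, since its body is not shown in this commit.

```ruby
# Hypothetical field map; the real one would come from the source config.
FIELD_MAP = {
  'fname' => 'PRIMARY_NAME_FIRST',
  'lname' => 'PRIMARY_NAME_LAST'
}.freeze

# Assumed behaviour: return the mapped name, or the key itself if unmapped.
def map_field(key)
  FIELD_MAP.fetch(key, key)
end

# Accept either a single record (Hash) or a batch (Array of Hashes).
# An explicit is_a? check is used because Array(hash) would turn the
# hash into an array of [key, value] pairs instead of wrapping it.
def normalize_records(payload)
  records = payload.is_a?(Array) ? payload : [payload]
  records.map { |row| row.transform_keys { |key| map_field(key) } }
end

single = normalize_records({ 'fname' => 'Ada', 'dob' => '1815-12-10' })
batch = normalize_records([{ 'fname' => 'Ada' }, { 'lname' => 'Hopper' }])

p single
p batch
```

Unlike the diff, this sketch uses the non-mutating `transform_keys`, so the caller's hashes are left unchanged.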
13 changes: 13 additions & 0 deletions lib/source/api/json.rb
@@ -0,0 +1,13 @@
# frozen_string_literal: true

require_relative 'base'

module Source
module API
# Import JSON records from the API.
# TODO: Do we actually need this?
class JSON < Base

end

RuboCop notice (lib/source/api/json.rb#L10-L11): Extra empty line detected at class body beginning. [Layout/EmptyLinesAroundClassBody]
end
end
9 changes: 8 additions & 1 deletion lib/source/base.rb
@@ -32,7 +32,7 @@ def each
#
# @return [String]
def name
- @source_config[:name] || self.class.name.split('::').last
+ @source_config[:name] || default_name
end

private
@@ -54,5 +54,12 @@ def field_mapper(field)
def defaults
{ field_map: {} }
end

# Default name for the current source.
#
# @return [String]
def default_name
self.class.name.split('::').last
end
end
end
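
Extracting `default_name` lets subclasses adjust the fallback name while `name` itself stays unchanged. A trimmed-down sketch of how the override in lib/source/api/base.rb composes with it via `super`; the classes here are reduced to the naming logic only, and the constructor signature is an assumption:

```ruby
module Source
  class Base
    # Assumed constructor: the real class may take different arguments.
    def initialize(source_config = {})
      @source_config = source_config
    end

    # An explicitly configured name wins over the class-derived default.
    def name
      @source_config[:name] || default_name
    end

    private

    def default_name
      self.class.name.split('::').last
    end
  end

  module API
    class Base < Source::Base
      private

      # Prefix the class-derived default so API sources are distinguishable.
      def default_name
        "API::#{super}"
      end
    end

    class JSON < Base; end
  end

  class CSV < Base; end
end

puts Source::CSV.new.name                          # CSV
puts Source::API::JSON.new.name                    # API::JSON
puts Source::API::JSON.new({ name: 'intake' }).name # intake
```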
2 changes: 1 addition & 1 deletion lib/transformation.rb
@@ -15,7 +15,7 @@ class InvalidTransform < RuntimeError; end
# @param transformations [Array<Hash>] Array of transformation configurations.
# @return [Hash] The resulting record after all transformations have been applied.
def self.transform(config, record, transformations)
- result = transformations.any? do |transformation|
+ result = transformations&.any? do |transformation|
transform_from_config(transformation).transform(record)
end
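
The switch to `&.any?` guards against sources that define no transformations. A brief sketch of what the safe navigation operator returns in each case; `any_transformed?` and the `:enabled` key are hypothetical stand-ins for the real transformation call:

```ruby
# Hypothetical helper mirroring the shape of the call above: the block
# stands in for applying one transformation and reporting success.
def any_transformed?(transformations)
  transformations&.any? { |transformation| transformation[:enabled] }
end

p any_transformed?(nil)                 # nil: `&.` skips the call entirely
p any_transformed?([])                  # false: nothing to apply
p any_transformed?([{ enabled: true }]) # true
```

Note that the nil case yields `nil` rather than `false`, which only matters to callers that compare the result strictly against `false`.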

5 changes: 5 additions & 0 deletions scripts/entrypoint.sh
@@ -4,6 +4,11 @@ BASE_PATH=$( dirname -- "$0" )
COMMAND="$1"

case $COMMAND in
api)
echo "Starting Clear My Record Entity Resolution API..."
cd /opt/cmr
bundle exec rackup --host 0.0.0.0 --port 3000
;;
load)
export CONFIG_FILE="/etc/cmr/config.yml"
"$BASE_PATH/load.sh"
