🎉 Add MongoDB Source
Signed-off-by: fut <fut.wrk@gmail.com>
FUT committed Mar 8, 2021
1 parent e2bed6c commit b1061e3
Showing 25 changed files with 1,283 additions and 1 deletion.
6 changes: 6 additions & 0 deletions airbyte-integrations/connectors/source-mongodb/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
NEW_SOURCE_CHECKLIST.md
.ruby-gemset
.ruby-version
.byebug_history

tmp
16 changes: 16 additions & 0 deletions airbyte-integrations/connectors/source-mongodb/Dockerfile
@@ -0,0 +1,16 @@
FROM ruby:3.0-alpine

RUN apk update
RUN apk add --update build-base libffi-dev

WORKDIR /airbyte

COPY . ./

RUN gem install bundler
RUN bundle install

ENTRYPOINT ["ruby", "/airbyte/source.rb"]

LABEL io.airbyte.name=airbyte/source-mongodb
LABEL io.airbyte.version=0.1.4
@@ -0,0 +1,8 @@
FROM mongo:4.0.23

COPY ./integration_tests /integration_tests

RUN echo "mongorestore --archive=integration_tests/dump/analytics.archive" > /docker-entrypoint-initdb.d/init.sh

LABEL io.airbyte.version=0.1.0
LABEL io.airbyte.name=airbyte/mongodb-integration-test-seed
8 changes: 8 additions & 0 deletions airbyte-integrations/connectors/source-mongodb/Gemfile
@@ -0,0 +1,8 @@
source 'https://rubygems.org'

gem 'mongo'
gem 'slop'
gem 'dry-types'
gem 'dry-struct'

# gem 'byebug'
43 changes: 43 additions & 0 deletions airbyte-integrations/connectors/source-mongodb/Gemfile.lock
@@ -0,0 +1,43 @@
GEM
  remote: https://rubygems.org/
  specs:
    bson (4.12.0)
    concurrent-ruby (1.1.8)
    dry-configurable (0.12.1)
      concurrent-ruby (~> 1.0)
      dry-core (~> 0.5, >= 0.5.0)
    dry-container (0.7.2)
      concurrent-ruby (~> 1.0)
      dry-configurable (~> 0.1, >= 0.1.3)
    dry-core (0.5.0)
      concurrent-ruby (~> 1.0)
    dry-inflector (0.2.0)
    dry-logic (1.1.0)
      concurrent-ruby (~> 1.0)
      dry-core (~> 0.5, >= 0.5)
    dry-struct (1.4.0)
      dry-core (~> 0.5, >= 0.5)
      dry-types (~> 1.5)
      ice_nine (~> 0.11)
    dry-types (1.5.1)
      concurrent-ruby (~> 1.0)
      dry-container (~> 0.3)
      dry-core (~> 0.5, >= 0.5)
      dry-inflector (~> 0.1, >= 0.1.2)
      dry-logic (~> 1.0, >= 1.0.2)
    ice_nine (0.11.2)
    mongo (2.14.0)
      bson (>= 4.8.2, < 5.0.0)
    slop (4.8.2)

PLATFORMS
  x86_64-linux

DEPENDENCIES
  dry-struct
  dry-types
  mongo
  slop

BUNDLED WITH
   2.2.3
33 changes: 33 additions & 0 deletions airbyte-integrations/connectors/source-mongodb/README.md
@@ -0,0 +1,33 @@
# MongoDB Source

This is the repository for the MongoDB source connector, written in Ruby.

## Local development
### Build
First, build the module by running the following from the Airbyte project root directory:
```
cd airbyte-integrations/connectors/source-mongodb/
docker build . -t airbyte/source-mongodb:dev
```

### Integration Tests
From the Airbyte project root, run:
```
./gradlew clean :airbyte-integrations:connectors:source-mongodb:integrationTest
```

## Configure credentials
### Configuring credentials as a community contributor
The required credentials are already stored in the `secrets` folder. You can adjust them manually to run more advanced tests, but by default they should work as is.

## Discover phase
MongoDB has no equivalent of a table definition, so column types have to be derived from the actual attributes and their values. The discover phase has two steps:

### Step 1. Find all unique properties
The connector runs a map-reduce command that returns all unique document properties in the collection. The map-reduce approach should be sufficient even for large clusters.
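
As a rough illustration of this step, the in-memory Ruby sketch below mimics the map and reduce stages on plain hashes. It does not talk to MongoDB, and `unique_properties` and the sample documents are hypothetical names used only for this example, not the connector's actual code.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the map-reduce step (assumption:
# only top-level keys are considered).
def unique_properties(documents)
  # "Map" stage: emit every top-level key of every document.
  emitted = documents.flat_map { |doc| doc.keys.map(&:to_s) }
  # "Reduce" stage: collapse the emitted keys into a unique, sorted list.
  emitted.each_with_object(Set.new) { |key, acc| acc << key }.to_a.sort
end

docs = [
  { '_id' => 1, 'name' => 'Ann' },
  { '_id' => 2, 'name' => 'Bob', 'age' => 41 },
  { '_id' => 3, 'email' => 'c@example.com' }
]

puts unique_properties(docs).inspect
```

In the real connector the equivalent work happens on the MongoDB server, so the documents themselves never have to be loaded into Ruby for this step.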

### Step 2. Determine property types
For each property found, the connector selects up to 10k documents from the collection in which this property is not empty. If all the selected values have the same type, the connector assigns that type to the property. In all other cases, the connector falls back to the `string` type.
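
The rule above can be sketched in a few lines of Ruby. This is a minimal sketch under stated assumptions: the `TYPE_NAMES` mapping and the `infer_type` helper are hypothetical and chosen for illustration, and the connector's real type mapping may differ.

```ruby
# Hypothetical mapping from Ruby value classes to type names (assumption).
TYPE_NAMES = {
  String => 'string',
  Integer => 'integer',
  Float => 'number',
  TrueClass => 'boolean',
  FalseClass => 'boolean',
  Hash => 'object',
  Array => 'array'
}.freeze

# Returns the type shared by all sampled values. Falls back to 'string'
# when the sample is empty, when the values map to more than one type,
# or when a value's class is missing from TYPE_NAMES.
def infer_type(values)
  types = values.map { |value| TYPE_NAMES.fetch(value.class, 'string') }.uniq
  types.size == 1 ? types.first : 'string'
end

puts infer_type([1, 2, 3])       # all integers
puts infer_type([1, 'two', 3.0]) # mixed types
puts infer_type([true, false])   # both map to boolean
```

Falling back to `string` keeps the discovered schema usable when a property holds values of several types, at the cost of losing the more specific type.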

## Author
This connector was authored by [Yury Koleda](https://github.com/FUT).
38 changes: 38 additions & 0 deletions airbyte-integrations/connectors/source-mongodb/build.gradle
@@ -0,0 +1,38 @@
plugins {
    // Makes building the docker image a dependency of Gradle's "build" command. This way you could run your entire build inside a docker image
    // via ./gradlew :airbyte-integrations:connectors:source-mongodb:build
    id 'airbyte-docker'
    id 'airbyte-standard-source-test-file'
}

airbyteStandardSourceTestFile {
    def dbName = "mongo-airbyte-integration-test"
    prehook {
        project.exec {
            def args = ["docker", "run", "--rm",
                        "--name", dbName,
                        "-d",
                        // assign to a weird port number so we don't conflict with any locally running mongo instances
                        "-p", "27888:27017",
                        "-e", "MONGO_INITDB_ROOT_USERNAME=user",
                        "-e", "MONGO_INITDB_ROOT_PASSWORD=password",
                        "airbyte/mongodb-integration-test-seed:dev"]
            commandLine args
        }
    }

    posthook {
        project.exec {
            commandLine "docker", "stop", dbName
        }
    }

    // All these input paths must live inside this connector's directory (or subdirectories)
    configPath = "secrets/valid_config.json"
    configuredCatalogPath = "secrets/configured_catalog.json"
    specPath = "lib/spec.json"
}

dependencies {
    implementation files(project(':airbyte-integrations:bases:base-standard-source-test-file').airbyteDocker.outputs)
}
Binary file not shown.
@@ -0,0 +1,27 @@
require_relative './airbyte_protocol.rb'

class AirbyteLogger
  # Wraps the text in an Airbyte LOG message and returns it as a JSON string.
  def self.format_log(text, log_level=Level::Info)
    alm = AirbyteLogMessage.from_dynamic!({
      'level' => log_level,
      'message' => text
    })

    AirbyteMessage.from_dynamic!({
      'type' => Type::Log,
      'log' => alm.to_dynamic
    }).to_json
  end

  # Formatter proc that emits log lines as Airbyte LOG messages.
  def self.logger_formatter
    proc { |severity, datetime, progname, msg|
      format_log("[#{datetime}] #{severity} : #{progname} | #{msg.dump}\n\n")
    }
  end

  def self.log(text, log_level=Level::Info)
    # Pass the caller's level through; the original reassigned it to Level::Info here.
    message = format_log(text, log_level)

    puts message
  end
end