39 changes: 39 additions & 0 deletions README.md
@@ -7,6 +7,45 @@ Carbon is a tool for loading data into [Symplectic Elements](https://support.sym

Please refer to the [mitlib-tf-workloads-carbon](https://github.com/mitlibraries/mitlib-tf-workloads-carbon) repository for the deployment configuration.

For more information on the Carbon application, please refer to our [internal documentation on Confluence](https://mitlibraries.atlassian.net/l/cp/1E6cMvuT).

## Data flow

This flowchart depicts the data flow from MIT's Data Warehouse, through the application's in-memory buffered streams, to the final location of the output: an XML file on the Elements FTP server.

```mermaid
---
title:
---

flowchart TB
subgraph ext-source[Database]
mit-dwrhs[(MIT Data Warehouse)]
end

subgraph in-memory[Application In-memory]
direction TB
rec-generator([Query Results Generator])
subgraph piped[Piped Read-Write Buffer]
buffered-writer([Buffered Writer])
buffered-reader([Buffered Reader])
end
ftps-client((FTPS Client))
end

subgraph elements-ftp[Elements FTP server]
direction TB
xml-file([Feed XML file])
end

mit-dwrhs -->|Fetch query results| rec-generator
rec-generator -->|Yield records one at a time, <br> transform each record into a normalized XML string, <br> and pass it to the write buffer| piped
buffered-writer -.->|Pipe contents to read buffer| buffered-reader
buffered-reader -->|Read buffer acts as the data feed for an XML file on the FTP server| ftps-client
ftps-client -->|Stream contents from read buffer to an XML file on the FTP server| xml-file
```
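
The piped read-write pattern in the flowchart can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Carbon's actual implementation: `fetch_records`, `to_xml`, `write_feed`, `stream_feed`, and the XML element names are all hypothetical, and the upload step is passed in as a callable so the sketch runs without an FTP server.

```python
import os
import threading
from io import BytesIO
from xml.sax.saxutils import escape


def fetch_records():
    """Hypothetical stand-in for the query results generator."""
    yield {"id": "1", "name": "Ada & Co"}
    yield {"id": "2", "name": "Grace"}


def to_xml(record):
    """Serialize one record as an XML string (element names are illustrative)."""
    return (
        f"<record><id>{escape(record['id'])}</id>"
        f"<name>{escape(record['name'])}</name></record>"
    )


def write_feed(records, write_fd):
    """Write records to the pipe one at a time as UTF-8 encoded XML."""
    # Leaving the with-block closes the write end, which signals EOF to the reader.
    with os.fdopen(write_fd, "wb") as writer:
        writer.write(b"<records>")
        for record in records:
            writer.write(to_xml(record).encode("utf-8"))
        writer.write(b"</records>")


def stream_feed(records, upload):
    """Connect the writer to the reader through an OS pipe and hand the reader to `upload`."""
    read_fd, write_fd = os.pipe()
    producer = threading.Thread(target=write_feed, args=(records, write_fd))
    producer.start()
    try:
        with os.fdopen(read_fd, "rb") as reader:
            upload(reader)
    finally:
        producer.join()


# Usage with an in-memory sink standing in for the FTPS client.
sink = BytesIO()
stream_feed(fetch_records(), lambda reader: sink.write(reader.read()))
```

With a real server, the `upload` callable could wrap the standard library's `ftplib.FTP_TLS.storbinary`, which reads from a file-like object in blocks. Because the writer runs in its own thread and the pipe has a bounded buffer, the full feed never has to be held in memory at once.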


## Development

* To install with dev dependencies: `make install`