spike: persistence/indexing into elasticsearch (and maybe sqlite) #4

Closed
adamdecaf opened this issue Jan 4, 2019 · 4 comments

adamdecaf commented Jan 4, 2019

A standalone instance of this app needs to download the OFAC files on its own and should refresh that copy after N hours (N is configurable and likely defaults to 24h). This lets someone start our app without any external dependencies and keeps the information up to date.
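
As a rough sketch of the configurable N, assuming the value arrives as a Go duration string: the environment variable name `OFAC_REFRESH_INTERVAL` and the helper `refreshInterval` are made up for illustration and are not part of the app.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultRefresh mirrors the "likely defaults to 24h" idea from this issue.
const defaultRefresh = 24 * time.Hour

// refreshInterval parses a duration such as "12h". An empty, invalid, or
// non-positive value falls back to defaultRefresh.
func refreshInterval(raw string) time.Duration {
	if raw == "" {
		return defaultRefresh
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultRefresh
	}
	return d
}

func main() {
	// OFAC_REFRESH_INTERVAL is a hypothetical variable name.
	fmt.Println("refresh every", refreshInterval(os.Getenv("OFAC_REFRESH_INTERVAL")))
}
```

Falling back to the default on bad input keeps a typo in config from stopping the app, at the cost of hiding the mistake; logging the parse error would be a reasonable addition.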

We can keep the files in temp storage close to the app. When the app restarts, it can check the modification time of the files and download them again only if they are too old. This would help prevent repeated downloads if the app is in a crash loop.
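
The restart check could look something like the following. This is only a sketch: the file location, the `maxAge` constant, and the helper names `tooOld` and `needsDownload` are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// maxAge stands in for the configurable N hours.
const maxAge = 24 * time.Hour

// tooOld reports whether a file of the given age has reached the limit.
func tooOld(age, limit time.Duration) bool {
	return age >= limit
}

// needsDownload reports whether path is missing or was last modified at
// least limit ago. Other stat errors are returned to the caller.
func needsDownload(path string, limit time.Duration) (bool, error) {
	info, err := os.Stat(path)
	if errors.Is(err, fs.ErrNotExist) {
		return true, nil // nothing on disk yet, so fetch it
	}
	if err != nil {
		return false, err
	}
	return tooOld(time.Since(info.ModTime()), limit), nil
}

func main() {
	// Hypothetical location; the real app may store files elsewhere.
	path := filepath.Join(os.TempDir(), "SDN.CSV")
	stale, err := needsDownload(path, maxAge)
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Println("download needed:", stale)
}
```

Because the decision depends only on the file's modification time, a process stuck in a crash loop sees a fresh file on each restart and skips the download.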

After reading the flat files, we might want to persist the structured data in a database to allow for better queries, full-text search, etc. I think a SQL solution would be best, and we can start with sqlite since our other apps use that.

The spec for CSV files isn't too bad and can probably be directly mapped to a few tables. ent_num is used to join the tables together.

FORMAT SDN CSV

Main table, text file name SDN.CSV

Column
sequence Column name  Type     Size  Description
-------- ------------ -------  ----  ---------------------
1        ent_num      number         unique record
                                     identifier/unique
                                     listing identifier
2        SDN_Name     text     350   name of SDN
3        SDN_Type     text     12    type of SDN
4        Program      text     50    sanctions program name
5        Title        text     200   title of an individual
6        Call_Sign    text     8     vessel call sign
7        Vess_type    text     25    vessel type
8        Tonnage      text     14    vessel tonnage
9        GRT          text     8     gross registered tonnage
10       Vess_flag    text     40    vessel flag
11       Vess_owner   text     150   vessel owner
12       Remarks      text     1000  remarks on SDN*

Address table, text file name ADD.CSV

Column
sequence Column name  Type     Size  Description
-------- ------------ -------  ----  ---------------------
1        Ent_num      number         link to unique listing
2        Add_num      number         unique record identifier
3        Address      text     750   street address of SDN
4        City/        text     116   city, state/province, zip/postal code
         State/Province/
         Postal Code
5        Country      text     250   country of address
6        Add_remarks  text     200   remarks on address

Alternate identity table, text file name ALT.CSV

Column
sequence Column name  Type     Size  Description
-------- ------------ -------  ----  ---------------------
1        ent_num      number         link to unique listing
2        alt_num      number         unique record identifier
3        alt_type     text     8     type of alternate identity
                                     (aka, fka, nka)
4        alt_name     text     350   alternate identity name
5        alt_remarks  text     200   remarks on alternate identity
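
To illustrate the join on ent_num before any database is involved, here is a minimal in-memory sketch using Go's standard `encoding/csv`. The struct layout, the function names, and the sample rows are invented for illustration; it only covers SDN.CSV and ADD.CSV and keeps a subset of the columns.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// SDN holds a subset of the SDN.CSV columns plus its joined addresses.
type SDN struct {
	EntNum, Name, Type, Program string // columns 1-4
	Addresses                   []Address
}

// Address holds a subset of the ADD.CSV columns.
type Address struct {
	EntNum, AddNum, Address, Country string // columns 1, 2, 3, 5
}

// readRows reads all records and rejects rows with fewer than minCols fields.
func readRows(r io.Reader, minCols int) ([][]string, error) {
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = -1 // do not require a fixed field count
	rows, err := cr.ReadAll()
	if err != nil {
		return nil, err
	}
	for i, row := range rows {
		if len(row) < minCols {
			return nil, fmt.Errorf("row %d: got %d columns, want at least %d", i+1, len(row), minCols)
		}
	}
	return rows, nil
}

// joinSDNs parses both files and attaches each address to its SDN by ent_num.
func joinSDNs(sdnCSV, addCSV io.Reader) ([]SDN, error) {
	addRows, err := readRows(addCSV, 6)
	if err != nil {
		return nil, fmt.Errorf("ADD.CSV: %w", err)
	}
	byEnt := make(map[string][]Address)
	for _, row := range addRows {
		a := Address{EntNum: strings.TrimSpace(row[0]), AddNum: strings.TrimSpace(row[1]), Address: row[2], Country: row[4]}
		byEnt[a.EntNum] = append(byEnt[a.EntNum], a)
	}
	sdnRows, err := readRows(sdnCSV, 12)
	if err != nil {
		return nil, fmt.Errorf("SDN.CSV: %w", err)
	}
	sdns := make([]SDN, 0, len(sdnRows))
	for _, row := range sdnRows {
		ent := strings.TrimSpace(row[0])
		sdns = append(sdns, SDN{EntNum: ent, Name: row[1], Type: row[2], Program: row[3], Addresses: byEnt[ent]})
	}
	return sdns, nil
}

// joinFromStrings is a small convenience wrapper for in-memory input.
func joinFromStrings(sdn, add string) ([]SDN, error) {
	return joinSDNs(strings.NewReader(sdn), strings.NewReader(add))
}

// Made-up sample rows, not real OFAC data.
const sampleSDN = `1001,"EXAMPLE SHIPPING CO",vessel,EXAMPLE-PROG,,,,,,,,` + "\n"
const sampleADD = `1001,5001,"1 Example Street","Example City",Exampleland,` + "\n" +
	`1001,5002,"2 Sample Road","Sample Town",Exampleland,` + "\n"

func main() {
	sdns, err := joinFromStrings(sampleSDN, sampleADD)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, s := range sdns {
		fmt.Printf("%s %q has %d address(es)\n", s.EntNum, s.Name, len(s.Addresses))
	}
}
```

The same shape carries over to SQL tables: one table per file, with ent_num as the foreign key from the address and alternate identity tables back to the main table.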
@adamdecaf
Member Author

> A standalone instance of this app needs to download the OFAC files on its own

One other reason for this is to have the minimum steps needed for local dev. Having anyone be able to go run our app (or docker run) is really powerful.

For local dev it'd be nice to just go run a 4th app and have all our services. http://docs.moov.io/en/latest/tutorials/local-dev/ (I have longer term plans to better automate 4+ local Go apps.)

@adamdecaf adamdecaf self-assigned this Jan 18, 2019
@adamdecaf adamdecaf changed the title from "initial storage / persistence / database" to "spike: persistence into elasticsearch (and maybe sqlite)" Jan 18, 2019

adamdecaf commented Jan 18, 2019

Changed the title. Let's store the OFAC records in elasticsearch (ES) to get something going. If we need to store the watches let's use sqlite - I'm not thinking of ES as durable storage right now.

@adamdecaf adamdecaf changed the title from "spike: persistence into elasticsearch (and maybe sqlite)" to "spike: persistence/indexing into elasticsearch (and maybe sqlite)" Jan 18, 2019
@adamdecaf
Member Author

I can take on deploying ES if no one else wants to, but I'd like to get people familiar with Kubernetes.

@adamdecaf adamdecaf removed their assignment Jan 18, 2019
@adamdecaf adamdecaf self-assigned this Jan 23, 2019
adamdecaf added a commit to adamdecaf/watchman that referenced this issue Jan 23, 2019
adamdecaf added a commit to adamdecaf/watchman that referenced this issue Jan 23, 2019
adamdecaf added a commit to adamdecaf/watchman that referenced this issue Jan 23, 2019
@adamdecaf
Member Author

We won't need ES storage for this, so that simplifies the app.
