- app/ - layer of application-level logic
- app/tools/geoimport - main package for the geodata importer
- app/services/geoapi - main package for the web API
- app/services/geoapi/handlers - input/output layer
- business/ - layer of business logic
- business/core/{city,country,location} - layer of access to business entities
- business/core/{city,country,location}/db - layer of access to database entities (city, country, location)
- business/data - helpers to manage data (migrations, seeds) and to set up tests
- foundation/ - all non-business-related logic
- foundation/database/ - common database-related helpers
- foundation/docker/ - common Docker-related helpers
- foundation/web/ - common web-related helpers
- infra/ - configuration for infrastructure
Request / Response
---------------------------------------------------------------------------------------
Input/Output layer (app/services/geoapi/handlers)
Resps.: Decode request
Encode response
Manage user-facing errors
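
For illustration, a minimal handler sketch of this layer could look like the following (the request fields, the cityCore interface and ErrInvalidInput are assumptions, not the actual code):

```go
package handlers

import (
	"context"
	"encoding/json"
	"errors"
	"net/http"
)

// ErrInvalidInput is a hypothetical business error surfaced by the core layer.
var ErrInvalidInput = errors.New("invalid input")

// cityCore is the slice of the core layer this handler needs (illustrative).
type cityCore interface {
	Create(ctx context.Context, name string, countryID int) (any, error)
}

type cityHandlers struct {
	core cityCore
}

func (h cityHandlers) create(w http.ResponseWriter, r *http.Request) {
	// Decode the request.
	var req struct {
		Name      string `json:"name"`
		CountryID int    `json:"country_id"`
	}
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request body", http.StatusBadRequest)
		return
	}

	// Delegate the work to the core layer.
	city, err := h.core.Create(r.Context(), req.Name, req.CountryID)
	if err != nil {
		// Manage user-facing errors: map known business errors to 4xx,
		// hide everything else behind a generic 500.
		if errors.Is(err, ErrInvalidInput) {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}

	// Encode the response.
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusCreated)
	_ = json.NewEncoder(w).Encode(city)
}
```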
---------------------------------------------------------------------------------------
Core layer (business/core/{city,country,location,importer})
Resps.: Validate input data
Prepare data for the database layer
Build results from the database layer
Handle errors from the database layer
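
A rough sketch of a core-layer function under the same assumptions (the City type, ErrInvalidInput and the storer interface are illustrative, not the real API):

```go
package city

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// ErrInvalidInput is returned for user-facing validation failures.
var ErrInvalidInput = errors.New("invalid input")

// storer is the database-layer dependency (illustrative signature).
type storer interface {
	Insert(ctx context.Context, name string, countryID int) (int64, error)
}

// City is the business entity built from the database-layer result.
type City struct {
	ID        int64
	Name      string
	CountryID int
}

type Core struct {
	store storer
}

// Create validates input, prepares data for the database layer,
// builds the result and wraps database errors.
func (c Core) Create(ctx context.Context, name string, countryID int) (City, error) {
	// Validate input data.
	name = strings.TrimSpace(name)
	if name == "" || countryID <= 0 {
		return City{}, ErrInvalidInput
	}

	// Call the database layer and handle its errors.
	id, err := c.store.Insert(ctx, name, countryID)
	if err != nil {
		return City{}, fmt.Errorf("inserting city: %w", err)
	}

	// Build the result for the caller.
	return City{ID: id, Name: name, CountryID: countryID}, nil
}
```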
---------------------------------------------------------------------------------------
Database layer (business/core/{city,country,location}/db)
Resps.: Communicate with MySQL via queries
Return entities from MySQL to the core layer
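
A minimal sketch of the database layer, assuming database/sql with a MySQL driver; the table and column names are illustrative:

```go
package db

import (
	"context"
	"database/sql"
	"fmt"
)

type Store struct {
	db *sql.DB
}

func NewStore(db *sql.DB) Store {
	return Store{db: db}
}

// Insert writes a city row and returns its generated id.
func (s Store) Insert(ctx context.Context, name string, countryID int) (int64, error) {
	const q = `INSERT INTO cities (name, country_id) VALUES (?, ?)`

	res, err := s.db.ExecContext(ctx, q, name, countryID)
	if err != nil {
		return 0, fmt.Errorf("exec insert: %w", err)
	}
	return res.LastInsertId()
}

// QueryByID reads a single city row back into values for the core layer.
func (s Store) QueryByID(ctx context.Context, id int64) (name string, countryID int, err error) {
	const q = `SELECT name, country_id FROM cities WHERE id = ?`

	row := s.db.QueryRowContext(ctx, q, id)
	if err := row.Scan(&name, &countryID); err != nil {
		return "", 0, fmt.Errorf("query city %d: %w", id, err)
	}
	return name, countryID, nil
}
```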
---------------------------------------------------------------------------------------
As the file is quite big, I had 3 ideas for how to implement the import:
- Read all data in a loop, put it into 1 huge transaction and commit it. This approach requires a lot of memory (because of the huge transaction) and is slow.
- Read data in a loop and pass the records to workers (I used 8 workers, as my laptop has 8 logical cores). The workers insert data without transactions, so there is no extra memory overhead. However, it takes additional time because indexes are updated on every database modification: 1M modifications mean 10M index updates (as I used 10 indexes). A sketch of this approach is shown after the list.
- Read data in a loop and pass the records to workers, as in the previous option, but with an important change: every worker handles a fixed set of countries. In other words, all records for a specific country are handled by one specific worker. This allows multiple transactions, one per worker. However, while implementing this I ran into an issue which I suspect is a MySQL driver bug: from time to time it holds a transaction for 50s. So I ended up with solution #2. In real life, I'd dig into this issue and debug the MySQL driver's code, or try to eliminate it by changing the logic; for instance, I could first import countries and cities in one transaction and import locations after that, which might save ~20 min of execution time. Given that there is no target time and this data will be imported once a day, I think a run time of ~1.5h is acceptable; we could import it at 4am without any issues. Of course, if stricter timing requirements appear, a pprof analysis of CPU usage will be done and the code will be optimized for best performance.
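
Below is a simplified sketch of solution #2, the worker pool inserting without transactions; the record type, the locations table and the insert statement are assumptions for illustration only:

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"runtime"
	"sync"
)

// record is a simplified geodata row (illustrative fields only).
type record struct {
	CountryCode string
	City        string
	Lat, Lng    float64
}

// importRecords fans records out to workers that insert without transactions.
func importRecords(ctx context.Context, db *sql.DB, records <-chan record) error {
	workers := runtime.NumCPU() // 8 on the laptop used for the measurements

	var wg sync.WaitGroup
	wg.Add(workers)

	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for r := range records {
				// Plain INSERT: no transaction, so memory stays flat,
				// but every statement pays the cost of index updates.
				_, err := db.ExecContext(ctx,
					`INSERT INTO locations (country_code, city, lat, lng) VALUES (?, ?, ?, ?)`,
					r.CountryCode, r.City, r.Lat, r.Lng)
				if err != nil {
					log.Printf("insert failed: %v", err)
				}
			}
		}()
	}

	wg.Wait()
	return ctx.Err()
}
```

For solution #3 the fan-out would instead route each record to a fixed worker based on its country (for example by hashing CountryCode), so every worker could own a single long-lived transaction.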
- make all
- make import
- make api