- extract gurume category from gurume.txt
- extract gurume list from gurume.txt
- transform gurume list to json format like below (a.k.a. gurume.json)
- load gurume.json on elasticsearch
// gurume.json example (v0.0.1)
[
  {
    "category": [
      {"name": "소고기"},
      {"name": "숙성 고기집"}
    ],
    "town": "서초동",
    "station": [
      {"name": "강남역"}
    ],
    "name": "어사담",
    "note": "드라이에이징"
  }
]
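The shape of the v0.0.1 example can be written down as Go types. This is a minimal sketch, not the project's actual code: the names `Named`, `Restaurant`, and `parseGurume` are hypothetical, and treating `note` as optional is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Named is a {"name": "..."} object, used for categories and stations.
type Named struct {
	Name string `json:"name"`
}

// Restaurant mirrors one entry of the gurume.json example (v0.0.1).
// The type and field names are illustrative, not taken from the project.
type Restaurant struct {
	Category []Named `json:"category"`
	Town     string  `json:"town"`
	Station  []Named `json:"station"`
	Name     string  `json:"name"`
	Note     string  `json:"note,omitempty"` // assumed optional
}

// parseGurume decodes a gurume.json document into a slice of Restaurant.
func parseGurume(data []byte) ([]Restaurant, error) {
	var out []Restaurant
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, fmt.Errorf("decode gurume.json: %w", err)
	}
	return out, nil
}

func main() {
	sample := []byte(`[{"category":[{"name":"소고기"},{"name":"숙성 고기집"}],` +
		`"town":"서초동","station":[{"name":"강남역"}],"name":"어사담","note":"드라이에이징"}]`)
	list, err := parseGurume(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(list[0].Name, len(list[0].Category), list[0].Station[0].Name)
}
```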
- ETL pipeline
- design ES mapping
- ES cloud setup
- add user dictionary (category, station, town)
- create api client role
- Backend ES client
- Frontend app
- AWS ECS setup
- Jenkins build / deploy pipeline
## check exception case
go run main.go gurume.txt | grep -v 'info\|review\|hotel' | grep exception
## for exception cases, update gurume.txt
## add 노포식당 handling
## WARN!! - ZERO WIDTH SPACE (U+200B) must be handled
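One possible way to handle the zero width space is to strip it from every input line before parsing. This is a sketch only: `cleanLine` is a hypothetical helper, and the project may handle U+200B differently. Note that `strings.TrimSpace` alone does not remove U+200B, because Go does not classify it as whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// zeroWidthSpace is U+200B: invisible, but it breaks exact string matches.
const zeroWidthSpace = "\u200b"

// cleanLine removes every U+200B and then trims surrounding whitespace.
func cleanLine(s string) string {
	return strings.TrimSpace(strings.ReplaceAll(s, zeroWidthSpace, ""))
}

func main() {
	raw := "을지면옥\u200b "
	fmt.Printf("%q -> %q\n", raw, cleanLine(raw))
}
```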
## generate processed txt
## 1. build images (skip this step when using a local Go env)
docker-compose build gurume
## 2. process gurume.txt -> gurume.processed.1.txt
### local env case
go run main.go formatData --file gurume.txt
### docker-compose env case
docker-compose run --rm gurume formatData --file gurume.txt
### local env case
go run main.go formatJSON
### docker-compose env case
docker-compose run --rm gurume formatJSON
## 3. check file
head -n2 data/gurume.processed.1.json | jq
{
  "category": "평양냉면",
  "station": "을지로 3가역",
  "town": "입정동",
  "name": "을지면옥"
}
{
  "category": "평양냉면",
  "station": "압구정역, 학동역",
  "town": "논현동",
  "name": "논현동 평양면옥"
}
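The processed records hold `station` as one string, sometimes with several comma-separated names, while the v0.0.1 example uses an array of `{"name": ...}` objects. A sketch of the split step might look like this, assuming the comma is the only separator; `FlatRecord` and `splitNames` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// FlatRecord matches the flat objects printed by the jq check.
type FlatRecord struct {
	Category string `json:"category"`
	Station  string `json:"station"`
	Town     string `json:"town"`
	Name     string `json:"name"`
}

// splitNames splits a comma-separated value, trims each part,
// and drops empty parts.
func splitNames(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	line := `{"category":"평양냉면","station":"압구정역, 학동역","town":"논현동","name":"논현동 평양면옥"}`
	var rec FlatRecord
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		panic(err)
	}
	for _, st := range splitNames(rec.Station) {
		fmt.Println(rec.Name, "->", st)
	}
}
```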
## build es
docker-compose build elasticsearch2 elasticsearch3 elasticsearch
## cluster up
docker-compose up -d elasticsearch2 elasticsearch3 elasticsearch
## bulk request to ES
docker-compose run --rm gurume ingestES
## mapping check
curl localhost:9200/gurume_index/_mapping | jq
## search test
curl \
  -H 'Content-Type: application/json' \
  -X POST 'localhost:9200/gurume_index/gurume/_search' \
  --data '{ "from": 0, "size": 30, "query" : { "match" : { "category.name" : "닭곰탕" } }}' | jq '.hits.hits[]._source.category'
curl \
  -H 'Content-Type: application/json' \
  -X POST 'localhost:9200/gurume_index/gurume/_search' \
  --data '{ "from": 0, "size": 30, "query" : { "match" : { "station.name" : "을지로 4가역" } }}' | jq '.hits.hits[]._source.station'
- https://cloud.elastic.co
- create ES cluster, then update the .env file accordingly
# .env example
GURUME_ENV=production
ES_CLUSTER_HOST=https://xxxxxxxxxxxxxx.ap-northeast-1.aws.found.io
ES_CLUSTER_PORT=9200
ES_CLUSTER_USER_ID=hoge
ES_CLUSTER_USER_PW=hoge
LOG_LEVEL=info
- ingest data to ES cluster
## bulk request to ES
docker-compose run --rm gurume ingestES
- API check in local env
### local env case
go run main.go api
### docker-compose env case
docker-compose run --rm gurume api
- Docker image build and push (manual)
## FYI, this will be automated by the build pipeline
docker build -t pureugong/gurume:latest .
$(aws --profile pureugong-gurume ecr get-login --no-include-email)
docker tag pureugong/gurume:latest {aws-ecr-host}/{ecr-repo-name}:{version}
docker push {aws-ecr-host}/{ecr-repo-name}:{version}
- S3 bucket
- routing
- build pipeline
- autocomplete tag