
Monstache stalling while indexing large collection holding 700K documents #509

jayminkapish opened this issue May 11, 2021 · 8 comments


@jayminkapish

Monstache has been working really well for us in the staging environment for the past couple of weeks. It synced 23K documents from the staging database to the staging Elasticsearch cluster very quickly (< 10m). Three days ago we moved the deployment to production with the same TOML configuration file as staging, except that the production collection is much larger: COLLECTION SIZE: 4.97 GB, TOTAL DOCUMENTS: 713,458.

We want to sync the entire MongoDB collection onto the Elasticsearch cluster and then tail the oplog.

verbose = false
enable-http-server = true
namespace-regex = "^db.collection$"
resume = true
stats = true
stats-duration = "30s"
debug = false
elasticsearch-retry = true
elasticsearch-max-docs = 50
elasticsearch-client-timeout = 60000

[[mapping]]
namespace = "db.collection"
index = "db"

[[script]]
namespace = "db.collection"
script = """
module.exports = function(doc, ns, updateDesc) {
    // the doc namespace e.g. test.test is passed as the 2nd arg
    // if available, an object containing the update description is passed as the 3rd arg
    return _.omit(doc, "raw", "name", "email", "_uid");
}
"""

And we have the following env vars:

{
  "monstache-direct-read-ns": "db.collection",
  "monstache-es-urls": "es_connection_url",
  "monstache-mongo-url": "mongo_connection_url"
}

Monstache kicked off the sync at a high rate, but after just a few hours it seems to be stalling and is only indexing 2-5 documents a minute. It logged "Direct reads completed" after about 6 hours.

The stats timer logged the following:

STATS 2021/05/11 01:59:52 {"Flushed":53745,"Committed":10426,"Indexed":230959,"Created":0,"Updated":0,"Deleted":1,"Succeeded":229042,"Failed":1918,"Workers":[{"Queued":0,"LastDuration":52000000},{"Queued":0,"LastDuration":8000000},{"Queued":0,"LastDuration":7000000},{"Queued":0,"LastDuration":12000000}]}

Monstache also logged the following error about 800 times in the first few hours after the production sync kicked off:

ERROR 2021/05/08 03:54:28 Bulk response item: {"_index":"attribute","_type":"_doc","_id":"5fd98c7287761f000163fe59","status":500,"error":{"type":"mapper_exception","reason":"timed out while waiting for a dynamic mapping update"}}

We've allocated 2 CPUs and 4 GB of memory to Monstache, and it is hardly using 2% of that at the moment.

Looking at the config, can you tell us what we can do to speed up the sync?

Thanks in advance.

@rwynn
Owner

rwynn commented May 12, 2021

Hi @jayminkapish what version of Elasticsearch do you have in production? I wonder from that error message if you might be running into a problem like the one described at elastic/elasticsearch#50670.

You may want to compare the results of a call to /index/_mapping in staging and production to see if the particular data in production is causing many dynamic mapping updates.

This page may also be helpful https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-settings-limit.html
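As a sketch (the index name db and the limit value 2000 below are placeholders, not taken from this thread), inspecting the production mapping and raising the per-index field limit might look like this:

GET /db/_mapping

PUT /db/_settings
{
  "index.mapping.total_fields.limit": 2000
}

Raising the limit only buys headroom; if the documents keep introducing new field names, each one still triggers a dynamic mapping update on the master node.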

@jayminkapish
Author

I should've provided these earlier:

INFO 2021/05/07 21:50:52 Started monstache version 6.7.5
INFO 2021/05/07 21:50:52 Go version go1.15.5
INFO 2021/05/07 21:50:52 MongoDB go driver v1.5.1
INFO 2021/05/07 21:50:52 Elasticsearch go driver 7.0.23
INFO 2021/05/07 21:50:52 Successfully connected to MongoDB version 4.4.5
INFO 2021/05/07 21:50:52 Successfully connected to Elasticsearch version 7.10.2
INFO 2021/05/07 21:50:52 Listening for events
INFO 2021/05/07 21:50:52 Watching changes on the deployment

Thanks for the pointers; we will look at them. I am assuming you found our TOML config fine for syncing the entire MongoDB collection onto the Elasticsearch cluster.

@jayminkapish
Author

Yes, the production data size is pretty big compared to staging, and there are many, many unique fields (the way the Mongo collection is designed). This must be contributing to the dynamic mapping timeouts. We had a similar issue with mongo-connector, but adjusting the bulk size got us to the finish line.

We are going to try limiting the bulk size via elasticsearch-max-bytes. We may try 2 MB (specified in bytes) and see whether we still run into dynamic mapping timeouts.
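A sketch of the bulk-size settings we have in mind (the 2 MB value is just what we plan to try, assuming elasticsearch-max-bytes caps the size of each bulk request as described in the monstache docs):

elasticsearch-max-docs = 50
# roughly 2 MB per bulk request
elasticsearch-max-bytes = 2097152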

@jayminkapish
Author

I think we are just looking for ways to slow down Monstache for the initial sync; something along the lines of the sketch below.
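A hedged sketch of settings that might throttle the initial sync (option names are from the monstache docs as I understand them, and the values are purely illustrative; please verify they exist in your version):

# process direct-read collections one at a time
direct-read-concur = 1
# fewer parallel segments per collection
direct-read-split-max = 2
# fewer concurrent bulk requests and smaller batches
elasticsearch-max-conns = 2
elasticsearch-max-docs = 50
elasticsearch-max-bytes = 2097152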

@sohel2020

@jayminkapish I'm having the same issue. It looks like it's taking forever to sync the data. @rwynn any opinion?

Stats:

{
  "Flushed": 280,
  "Committed": 380,
  "Indexed": 403,
  "Created": 0,
  "Updated": 0,
  "Deleted": 0,
  "Succeeded": 403,
  "Failed": 0,
  "Workers": [
    {
      "Queued": 0,
      "LastDuration": 7000000
    },
    {
      "Queued": 0,
      "LastDuration": 5000000
    },
    {
      "Queued": 0,
      "LastDuration": 6000000
    },
    {
      "Queued": 0,
      "LastDuration": 6000000
    },
    {
      "Queued": 0,
      "LastDuration": 5000000
    },
    {
      "Queued": 0,
      "LastDuration": 5000000
    },
    {
      "Queued": 0,
      "LastDuration": 5000000
    },
    {
      "Queued": 0,
      "LastDuration": 7000000
    },
    {
      "Queued": 0,
      "LastDuration": 5000000
    },
    {
      "Queued": 0,
      "LastDuration": 4000000
    }
  ]
}
INFO 2021/06/18 11:28:27 Started monstache version 6.7.5
INFO 2021/06/18 11:28:27 Go version go1.15.5
INFO 2021/06/18 11:28:27 MongoDB go driver v1.5.1
INFO 2021/06/18 11:28:27 Elasticsearch go driver 7.0.23
INFO 2021/06/18 11:28:27 Successfully connected to MongoDB version 4.4.2
INFO 2021/06/18 11:28:27 Successfully connected to Elasticsearch version 7.13.1
INFO 2021/06/18 11:28:27 Listening for events
INFO 2021/06/18 11:28:27 Sending systemd READY=1
WARN 2021/06/18 11:28:27 Systemd notification not supported (i.e. NOTIFY_SOCKET is unset)
INFO 2021/06/18 11:28:27 Starting http server at :8080
INFO 2021/06/18 11:28:27 Watching changes on the deployment
INFO 2021/06/18 11:28:27 Resuming stream '' from collection monstache.tokens using resume name 'default'
INFO 2021/06/18 11:28:27 Direct reads completed

Config:

gzip = true
stats = true
index-stats = true
replay = false
resume = true
resume-strategy = 1
index-files = false
verbose = true
direct-read-split-max = 20
exit-after-direct-reads = false
enable-http-server = true
elasticsearch-max-conns = 10

**Env vars**

MONSTACHE_MONGO_URL: "mongodb://10.10.5.37:5557"
MONSTACHE_ES_URLS: "http://my-es-http:9200"
MONSTACHE_ES_USER: "elastic"
MONSTACHE_ES_PASS: "0UDl04649LxkK0GE9"
MONSTACHE_DIRECT_READ_NS: "mydb.admin,mydb.area,mydb.banner,mydb.brand,mydb.campaing,mydb.category,mydb.city,mydb.deliveryCharge,mydb.deliveryTime,mydb.division,mydb.product,mydb.region,mydb.reward,mydb.shop,mydb.slider,mydb.sliderItem,mydb.user,mydb.version"

@jayminkapish
Author

We've paused the sync since my last comment. We're hoping to resume this work in July.

@asmaaelk

@rwynn any updates on this?

@rwynn
Owner

rwynn commented Aug 20, 2021

@asmaaelk can you describe the error or behavior you are seeing?
Are you also receiving errors like "timed out while waiting for a dynamic mapping update"?
If so, it is best to map your data explicitly using index templates.
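As a rough sketch (the template name, index pattern, and field names below are placeholders, not taken from this thread), a composable index template that defines the fields up front and disables dynamic mapping for everything else might look like this. Composable templates are available since Elasticsearch 7.8, so they apply to the 7.10 and 7.13 clusters mentioned in this thread:

PUT _index_template/db-template
{
  "index_patterns": ["db"],
  "template": {
    "mappings": {
      "dynamic": false,
      "properties": {
        "title":     { "type": "text" },
        "createdAt": { "type": "date" }
      }
    }
  }
}

With "dynamic": false, unmapped fields are still stored in _source but no longer trigger mapping updates during the bulk load; "strict" would reject such documents instead.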
