How to select Postgres DB database? #16
At the moment the database name is hard-coded in the
Set the
@ept Martin, thanks! Now I've got it! But when I try to connect bottledwater to a different database, I see this error:
Modified connection string with changed db link and dbname
I checked whether I have the extension in the DB:
I appreciate any help. Thanks!
I checked for the replication stream, and it exists in the DB. Still the same error:
Looks like I have replication slots with the same name in both DBs: postgres and toshi_development. How can I drop one?
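For readers in the same situation, here is a minimal sketch of how replication slots can be inspected and dropped with standard PostgreSQL facilities. The `pg_replication_slots` view is cluster-wide, so it lists the same slots whichever database you are connected to; its `database` column shows which database a logical slot belongs to. The slot name `bottledwater`, the `postgres` user, and the database passed to `psql` are assumptions for illustration, not values confirmed in this thread.

```shell
#!/bin/sh
# Assumed names: the slot name "bottledwater" used below is a placeholder;
# substitute the slot_name reported by the first query on your own server.

# Print SQL that lists every replication slot and the database it is tied to.
list_slots_sql() {
  printf '%s\n' "SELECT slot_name, plugin, database, active FROM pg_replication_slots;"
}

# Print SQL that drops the named slot. The slot must not be in use by a
# connected client (active must be false) when this runs.
drop_slot_sql() {
  printf "SELECT pg_drop_replication_slot('%s');\n" "$1"
}

# Example invocation, left commented out because it needs psql and a
# reachable server:
#   psql -U postgres -d postgres -c "$(list_slots_sql)"
#   psql -U postgres -d postgres -c "$(drop_slot_sql bottledwater)"
```

Stop the client that is consuming the slot before dropping it, since PostgreSQL refuses to drop an active slot.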
OK, I dropped the replication slot
and now I have
Oh... I've fixed everything with Postgres, but now Kafka doesn't accept it
I increased swap to 32 GB and got past this valley of death, but an estimate of how much RAM I need would still be useful. The next problem is with Avro. Any ideas on how to debug it?
Sorry for the spam. I just like your product and would like to use it in my project 😃
Hi @igorbarinov, just catching up on this. Good work debugging your way up to this point :)
For the HTTP 500 response from the schema registry, there should be something in the schema registry log (perhaps that's a bug in the schema registry). But it looks like it was transient.
The memory use is a bit worrying: Bottled Water is supposed to use only a small amount of memory, even on a large database. So perhaps there's a memory leak somewhere. If you have time to look into it, I'd appreciate your contribution; otherwise I'll look when I get the time.
Finally, the "row conversion failed: Field index 2 out of range" error. It's interesting that the error occurs in the logical decoding plugin, but not while taking the snapshot. Do you know in which table you modified data after taking the snapshot, i.e. the table whose row it's trying to decode here? Could you give me the schema of that table? My guess is that your database has some edge case in the table's tuple descriptor, which BW isn't handling correctly.
Martin, thanks! In standby mode, when I disconnect it from the network, I don't get the "row conversion failed: Field index 2 out of range" error. But after about ten minutes I got a new error (top: schema-registry log; bottom left: bottledwater log; bottom right: Kafka log).
Is it possible to consume only specified tables rather than the complete schema of the database?
Hm, not obvious what's going wrong here. I made a patch to log detailed debugging info on row conversion failure: see #17. Would you mind building with that patch, re-running and giving me the output?
To answer your question, it's currently not possible to filter by table. That feature could be added in principle, but hasn't been high priority thus far.
@igorbarinov I've rebuilt the docker image with the debugging output, so could you try again with the latest image? It also fixes a memory leak (#20), so hopefully you should now be able to take a snapshot without using excessive memory.
Nice, I will try it today!
I am having the same problem with the avro row conversion failure. Here is the output:
After posting this, I see that it is trying to set the primary key (record_id) to be nullable and to have no definition. I don't know why it would be doing this, because when I select all the rows, record_id has a value for every row. This only seems to happen when I import a lot of data into the table (I am using an SSIS task to import data from SQL Server to this Postgres database). I tried manually inserting data into the same table schema on a different VM, and this error did not come up. Also, I get the same memory error when running the SSIS task and Bottled Water at the same time:
But if I wait until the SSIS task is finished and then start Bottled Water, I get the Avro row conversion failure. Let me know if you have any thoughts on this.
I decided to reset everything (so I rm'ed the bottledwater container, removed the replication slot from the DB, and dropped and recreated the table) and the Avro row conversion failure seemed to go away. I think it was due to the fact that I had dropped and recreated the table without dropping the replication slot, which meant that Bottled Water was trying to track the fact that I dropped the table. So that error seems to be my bad. I believe that error was also cascading and causing the other one too, so this may all have been due to the fact that Bottled Water does not like to track tables that had data and were either dropped or had all of their data removed.
On further inspection, the Avro row conversion error still comes up when I am running the SSIS transfer task and Bottled Water at the same time. But if I run SSIS to completion and then start Bottled Water, it syncs the changes perfectly. So it seems like Bottled Water has a hard time syncing changes while data is simultaneously being added to the table quickly.
How can I select or exclude a single schema of a Postgres database? It seems Bottled Water scans every schema and checks whether each table has a primary key.
I am going to close this issue, since several different problems have got jumbled up in the description, and several of them are now fixed:
Hi!
I am trying to set up bottledwater-pg with my existing docker containers for data ingestion from Postgres to Kafka.
How do I set which database to use when I run:
docker run -d --name bottledwater --hostname bottledwater --link postgres:postgres \
    --link kafka:kafka --link schema-registry:schema-registry confluent/bottledwater:0.1
Right now it uses the postgres database, and I need to use a different one.
Thanks!
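As background to the question, a libpq-style connection URI names the target database in its path component. The sketch below only assembles such a URI. The user, port, and the database name `toshi_development` (mentioned elsewhere in this thread) are placeholders, the host `postgres` is the link alias from the command above, and the helper `pg_uri` is hypothetical. How the URI is handed to Bottled Water depends on the image or build in use, so that step is not shown.

```shell
#!/bin/sh
# Build a libpq connection URI of the form postgres://user@host:port/dbname.
# All argument values used below are illustrative assumptions.
pg_uri() {
  # $1 = user, $2 = host, $3 = port, $4 = database name
  printf 'postgres://%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

# The last path segment selects the database to connect to.
pg_uri postgres postgres 5432 toshi_development
```

Changing only the final argument switches the database while leaving the host and credentials untouched.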