Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow aerospike sink to handle multiple tables #2393

Merged
merged 2 commits into from Feb 19, 2024

Conversation

chubei
Copy link
Collaborator

@chubei chubei commented Feb 12, 2024

This PR makes it possible for one sink to handle multiple tables, thus making it possible to keep operation order.

@prabhuvaibhav I haven't tested it yet. Need your help to test this PR.

@duonganhthu43 the UI must change accordingly. I've made updates to contract.proto. Need your help with frontend updates.

@chubei chubei force-pushed the feat/aerospike_consistency branch 2 times, most recently from 20c85d6 to f715ce2 Compare February 12, 2024 14:40
@prabhuvaibhav
Copy link
Contributor

prabhuvaibhav commented Feb 16, 2024

I successfully tested the PR for INSERT, DELETE and UPDATE queries. I set up Oracle DB on Amazon RDS and everything else was set up on my local machine.

The config used for testing:

version: 1
app_name: consistency-test
connections:
  - config: !Oracle
      user: DOZER
      password: 123
      host: oracle-db-consistency-test.cxtwfj9nkwtu.ap-southeast-1.rds.amazonaws.com
      port: 1521
      sid: ORCL
      schemas: 
        - VAIBHAV
      replicator: !LogMiner
        poll_interval_in_milliseconds: 1000
    name: oracle
  
  - config: !Aerospike
      namespace: test
      sets: 
        - customers
        - transactions
      hosts: localhost:3000
    name: aerospike

sources:
  - name: customers
    table_name: CUSTOMERS
    schema: VAIBHAV
    connection: oracle
    columns:
      - CUSTOMER_ID
      - FIRST_NAME
      - LAST_NAME
      - CITY

  - name: transactions
    table_name: TRANSACTIONS
    schema: VAIBHAV
    connection: oracle
    columns:
      - TRANSACTION_ID
      - CUSTOMER_ID
      - TYPE
      - STATUS

sinks:
  - name: transactions
    config: !Aerospike
      connection: aerospike
      n_threads: 1
      tables: 
        - namespace: test
          set_name: customers
          source_table_name: customers
        - namespace: test
          set_name: transactions
          source_table_name: transactions
          denormalize:
            - from_namespace: test
              from_set: customers
              key: CUSTOMER_ID
              columns:
                - CITY

Initially, the tables CUSTOMERS and TRANSACTIONS were populated with 1000 and 50000 records respectively.
After snapshotting was successful, replication was tested using SQL queries on the source.
INSERT queries tested using:

INSERT INTO customers (CUSTOMER_ID, FIRST_NAME, LAST_NAME, DOB, EMAIL, PHONE_NUMBER, ADDRESS, CITY, STATE, ZIP_CODE, COUNTRY) 
SELECT 10000, 'John', 'Doe', TO_DATE('1990-01-01', 'YYYY-MM-DD'), 'john@example.com', '1234567890', '123 Main St', 'Anytown', 'CA', '12345', 'USA' FROM DUAL UNION ALL
SELECT 10001, 'Jane', 'Smith', TO_DATE('1985-05-15', 'YYYY-MM-DD'), 'jane@example.com', '9876543210', '456 Elm St', 'Othertown', 'NY', '54321', 'USA' FROM DUAL UNION ALL
SELECT 10002, 'Michael', 'Johnson', TO_DATE('1982-08-20', 'YYYY-MM-DD'), 'michael@example.com', '1112223333', '789 Oak St', 'Anycity', 'TX', '67890', 'USA' FROM DUAL UNION ALL
SELECT 10003, 'Emily', 'Brown', TO_DATE('1995-12-10', 'YYYY-MM-DD'), 'emily@example.com', '5556667777', '321 Pine St', 'Sometown', 'FL', '54321', 'USA' FROM DUAL UNION ALL
SELECT 10004, 'David', 'Williams', TO_DATE('1980-07-25', 'YYYY-MM-DD'), 'david@example.com', '9998887777', '567 Cedar St', 'Yourtown', 'IL', '98765', 'USA' FROM DUAL UNION ALL
SELECT 10005, 'Sarah', 'Miller', TO_DATE('1988-03-05', 'YYYY-MM-DD'), 'sarah@example.com', '2223334444', '789 Maple St', 'Hometown', 'OH', '23456', 'USA' FROM DUAL UNION ALL
SELECT 10006, 'Christopher', 'Wilson', TO_DATE('1983-09-18', 'YYYY-MM-DD'), 'chris@example.com', '7778889999', '654 Oak St', 'Theirtown', 'GA', '34567', 'USA' FROM DUAL UNION ALL
SELECT 10007, 'Jessica', 'Martinez', TO_DATE('1975-06-30', 'YYYY-MM-DD'), 'jessica@example.com', '4445556666', '987 Elm St', 'Heretown', 'PA', '45678', 'USA' FROM DUAL UNION ALL
SELECT 10008, 'Daniel', 'Taylor', TO_DATE('1992-11-20', 'YYYY-MM-DD'), 'daniel@example.com', '6667778888', '1010 Oak St', 'Somewhere', 'WA', '56789', 'USA' FROM DUAL UNION ALL
SELECT 10009, 'Amanda', 'Anderson', TO_DATE('1979-04-15', 'YYYY-MM-DD'), 'amanda@example.com', '8889990000', '246 Cedar St', 'Everytown', 'TX', '67890', 'USA' FROM DUAL;

INSERT INTO transactions (TRANSACTION_ID, CUSTOMER_ID, TYPE, AMOUNT, CURRENCY, TRANSACTION_DATE, STATUS, DESCRIPTION) 
SELECT 100000, 10000, 'Transfer', 100.00, 'USD', TO_DATE('2024-02-15', 'YYYY-MM-DD'), 'Completed', 'Transfer from savings' FROM DUAL UNION ALL
SELECT 100001, 10001, 'Deposit', 50.00, 'USD', TO_DATE('2024-02-14', 'YYYY-MM-DD'), 'Completed', 'Deposit to checking account' FROM DUAL UNION ALL
SELECT 100002, 10002, 'Withdrawal', 75.00, 'EUR', TO_DATE('2024-02-14', 'YYYY-MM-DD'), 'Pending', 'Withdrawal from ATM' FROM DUAL UNION ALL
SELECT 100003, 10003, 'Transfer', 200.00, 'USD', TO_DATE('2024-02-13', 'YYYY-MM-DD'), 'Completed', 'Transfer to investment account' FROM DUAL UNION ALL
SELECT 100004, 10004, 'Deposit', 150.00, 'USD', TO_DATE('2024-02-13', 'YYYY-MM-DD'), 'Completed', 'Paycheck deposit' FROM DUAL UNION ALL
SELECT 100005, 10005, 'Withdrawal', 50.00, 'EUR', TO_DATE('2024-02-12', 'YYYY-MM-DD'), 'Completed', 'Cash withdrawal' FROM DUAL UNION ALL
SELECT 100006, 10006, 'Transfer', 300.00, 'USD', TO_DATE('2024-02-12', 'YYYY-MM-DD'), 'Pending', 'Transfer to friend' FROM DUAL UNION ALL
SELECT 100007, 10007, 'Deposit', 200.00, 'USD', TO_DATE('2024-02-11', 'YYYY-MM-DD'), 'Completed', 'Bonus deposit' FROM DUAL UNION ALL
SELECT 100008, 10008, 'Withdrawal', 100.00, 'EUR', TO_DATE('2024-02-11', 'YYYY-MM-DD'), 'Completed', 'Online purchase' FROM DUAL UNION ALL
SELECT 100009, 10009, 'Transfer', 400.00, 'USD', TO_DATE('2024-02-10', 'YYYY-MM-DD'), 'Pending', 'Transfer to savings' FROM DUAL;
COMMIT;

The changes were replicated to the sink successfully.

UPDATE queries were tested using:

UPDATE CUSTOMERS
SET CITY = 'New York'
WHERE CUSTOMER_ID = 10000;
COMMIT;

The above query was failing initially, but the bug was found in the parser's regex and was fixed in #2401.
After the fix, the changes were being replicated correctly.

DELETE queries were tested using:

DELETE FROM CUSTOMERS
WHERE CUSTOMER_ID >= 10000;
COMMIT;

The above query was failing initially as well, but the bug was found in another regex, and was fixed in #2400.
Successful after the fix.

A transaction with all three types combined was tested as well.

-- The order of queries must be preserved, or else the new transaction would have the wrong city
INSERT INTO CUSTOMERS VALUES(10000, 'John', 'Doe', TO_DATE('1990-01-01', 'YYYY-MM-DD'), 'john@example.com', '1234567890', '123 Main St', 'Anytown', 'CA', '12345', 'USA');
INSERT INTO TRANSACTIONS VALUES(100000, 10000, 'Transfer', 100.00, 'USD', TO_DATE('2024-02-15', 'YYYY-MM-DD'), 'Completed', 'Transfer from savings');

UPDATE CUSTOMERS
SET CITY = 'New Delhi'
WHERE CUSTOMER_ID = 10000;

INSERT INTO TRANSACTIONS VALUES(100001, 10000, 'Transfer', 100.00, 'USD', TO_DATE('2024-02-15', 'YYYY-MM-DD'), 'Completed', 'Transfer from savings');

UPDATE TRANSACTIONS
SET STATUS = 'Pending'
WHERE TRANSACTION_ID = 100000;

--DELETE FROM CUSTOMERS
--WHERE CUSTOMER_ID = 10000;
COMMIT;

Successful.

@chubei chubei marked this pull request as ready for review February 19, 2024 01:13
@chubei chubei merged commit b74a7c2 into getdozer:main Feb 19, 2024
4 checks passed
@chubei chubei deleted the feat/aerospike_consistency branch February 19, 2024 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants