[QUESTION] Populating huge data presets #729

sharkzp · 2014-12-11T16:41:39Z

Hi everyone,

We have come across a very painful setup.
When we try to test our reports or any other complex queries, we find that the setup can become very large and complex to maintain and support in the future. For example:

SELECT COUNT(DISTINCT order_items.id) AS order_items,
       DATE(order_items.created_on)   AS date
FROM orders
JOIN order_items                  ON order_items.order_id = orders.id
JOIN shipments                    ON shipments.id = order_items.shipment_id
JOIN transactions                 ON transactions.order_id = orders.id
JOIN customers                    ON customers.id = orders.customer_id
LEFT JOIN flagged_transactions ON flagged_transactions.transaction_id = transactions.id
WHERE orders.state NOT IN ('a', 'b', 'c', '')
 AND customers.state NOT IN ('fraud', 'maybe_fraud')
 AND shipments.state NOT IN ('cancelled', 'shipped')
 AND shipments.vendor_id = %{some_vendor}
 AND order_items.quantity > 0
 AND order_items.state IN ('initial', 'approve')
 AND order_items.created_on BETWEEN %{start_date} AND %{end_date}
 AND (flagged_transactions.id IS NULL OR flagged_transactions.state NOT IN ('pending'))
GROUP BY date
ORDER BY date

You can imagine how huge the canonical setup with factories will be. In this light, could you suggest a best practice or an uncommon usage of FactoryGirl that we are missing?

joshuaclayton · 2015-05-08T16:18:34Z

@sharkzp with this structure, it's difficult to model FG in such a manner where you'd be able to capture each and every state.

I'd probably start with obvious differences and create traits - given this query, capturing non-fraudulent customers, unshipped orders, approved order items, unflagged transactions - to capture these concepts.

I'd instead focus on simplifying as much logic as possible, or teasing this query apart into multiple queries and testing individual areas, instead of trying to create data to cover every single permutation to ensure the query itself is correct.

sharkzp · 2015-05-08T16:29:15Z

@joshuaclayton unfortunately that (slicing the query) is impossible to do :) Our DB is very large (a few TB) and we cannot let Ruby handle large amounts of records/data because of performance. The only thing we can do now is create a few big factories and tie them together to meet our requirements. Thanks a lot for your suggestions.

joshuaclayton closed this as completed May 8, 2015
