[QUESTION] Populating huge data presets #729

Closed
sharkzp opened this issue Dec 11, 2014 · 2 comments

sharkzp commented Dec 11, 2014

Hi everyone,

We have run into a very painful test setup.
When we test our reports or other complex queries, the setup becomes huge, complex, and hard to maintain and support in the future. For example:

SELECT COUNT(DISTINCT order_items.id) AS order_items,
       DATE(order_items.created_on)   AS date
FROM orders
JOIN order_items                  ON order_items.order_id = orders.id
JOIN shipments                    ON shipments.id = order_items.shipment_id
JOIN transactions                 ON transactions.order_id = orders.id
JOIN customers                    ON customers.id = orders.customer_id
LEFT JOIN flagged_transactions ON flagged_transactions.transaction_id = transactions.id
WHERE orders.state NOT IN ('a', 'b', 'c', '')
 AND customers.state NOT IN ('fraud', 'maybe_fraud')
 AND shipments.state NOT IN ('cancelled', 'shipped')
 AND shipments.vendor_id = %{some_vendor}
 AND order_items.quantity > 0
 AND order_items.state IN ('initial', 'approve')
 AND order_items.created_on BETWEEN %{start_date} AND %{end_date}
 AND (flagged_transactions.id IS NULL OR flagged_transactions.state NOT IN ('pending'))
GROUP BY date
ORDER BY date

You can imagine how huge the canonical setup with factories would be. With that in mind, could you suggest a best practice or an uncommon usage of FactoryGirl that we are missing?

@joshuaclayton
Contributor

@sharkzp with this structure, it's difficult to model FactoryGirl factories in a way that captures each and every state.

I'd probably start with the obvious differences and create traits for them: given this query, that means non-fraudulent customers, unshipped orders, approved order items, and unflagged transactions.
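
As a rough sketch of what such traits might look like (not taken from the thread): the factory names below simply mirror the tables in the query, and state values such as 'active' and 'pending' are placeholders for whatever the real application uses. It assumes the corresponding models exist and that the factory_girl gem is loaded.

```ruby
# spec/factories/report_traits.rb (hypothetical file name)
# Assumption: Customer, Shipment and OrderItem models exist with these columns.
FactoryGirl.define do
  factory :customer do
    trait :non_fraudulent do
      state { 'active' } # placeholder: any value outside ('fraud', 'maybe_fraud')
    end

    trait :fraudulent do
      state { 'fraud' }
    end
  end

  factory :shipment do
    trait :unshipped do
      state { 'pending' } # placeholder: any value outside ('cancelled', 'shipped')
    end

    trait :shipped do
      state { 'shipped' }
    end
  end

  factory :order_item do
    quantity { 1 } # the query requires quantity > 0

    trait :approved do
      state { 'approve' } # one of the states the query counts
    end
  end
end
```

A spec could then combine them, for example `FactoryGirl.create(:order_item, :approved, shipment: FactoryGirl.create(:shipment, :unshipped))`. An unflagged transaction probably needs no trait at all, since it is just a transaction with no matching `flagged_transactions` row.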

Rather than creating data to cover every single permutation to ensure the query itself is correct, I'd focus on simplifying as much logic as possible, or on teasing this query apart into multiple queries and testing the individual areas.
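
One possible reading of "teasing the query apart", sketched in plain Ruby: give each filter from the WHERE clause a name so it can be reviewed and exercised on its own, then assemble the full clause from the pieces. The module name, the condition names, and the `where_clause` helper are all hypothetical; only the SQL fragments come from the query above.

```ruby
# Hypothetical module: names are illustrative, SQL fragments are copied
# from the report query in the original post.
module OrderItemsReport
  CONDITIONS = {
    valid_orders:           "orders.state NOT IN ('a', 'b', 'c', '')",
    trusted_customers:      "customers.state NOT IN ('fraud', 'maybe_fraud')",
    open_shipments:         "shipments.state NOT IN ('cancelled', 'shipped')",
    countable_items:        "order_items.quantity > 0 AND order_items.state IN ('initial', 'approve')",
    unflagged_transactions: "(flagged_transactions.id IS NULL OR flagged_transactions.state NOT IN ('pending'))"
  }.freeze

  # Joins the named conditions with AND; with no arguments, uses all of them.
  # Raises KeyError for an unknown condition name.
  def self.where_clause(*names)
    names = CONDITIONS.keys if names.empty?
    names.map { |name| CONDITIONS.fetch(name) }.join("\n AND ")
  end
end

puts OrderItemsReport.where_clause(:trusted_customers, :open_shipments)
```

The filtering still runs entirely in the database; the split only affects how the SQL text is organised, so a spec can target one condition at a time with a smaller data setup.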

sharkzp commented May 8, 2015

@joshuaclayton unfortunately, that (slicing the query) is impossible to do :) Our DB is very large (a few TB), and we cannot let Ruby handle large amounts of records/data for performance reasons. The only thing we can do now is create a few big factories and tie them together to satisfy our requirements. Thanks a lot for your suggestions.
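
For anyone landing here later, a minimal sketch of what one such "big" factory tied together might look like. Everything in it is an assumption: it presumes `:order`, `:shipment` and `:transaction` factories already exist, that the default customer is not flagged as fraud, and that 'pending' is a valid shipment state.

```ruby
# Hypothetical composite factory: builds one order item that the report
# query above should count, together with the rows its joins require.
FactoryGirl.define do
  factory :reportable_order_item, class: 'OrderItem' do
    association :order
    association :shipment, state: 'pending' # placeholder: not 'cancelled' or 'shipped'
    state { 'approve' }
    quantity { 1 }
    created_on { Date.today }

    after(:create) do |item|
      # The query inner-joins transactions, so the order needs at least one.
      FactoryGirl.create(:transaction, order: item.order)
    end
  end
end
```

A spec would still have to pass a shipment with the vendor under test, since the query filters on `shipments.vendor_id`, and would then override a single attribute per example to check that the row drops out of the count.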
