Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

field name change between database stack deploys #22

Open
leecookson opened this issue Jun 8, 2022 · 4 comments
Open

field name change between database stack deploys #22

leecookson opened this issue Jun 8, 2022 · 4 comments

Comments

@leecookson
Copy link

I created 2 cloudformation stacks for this a few years ago, and the schema fields that represent multiple words like requestip and hostheader are single words as typed here.

In December 2019, the fields were updated to use snake_case to conform to AWS Cloudfront specs:
c76192a

I created a 3rd stack for new product sites after this, and the schemas don't match. I have a service that runs queries to find bot crawl numbers, and I'm going to have to come up a way to construct the WHERE clauses to differ based on which database is accessed.

Is there a way to safely modify the older 2 stacks to the new snake_case field names?

@steffeng
Copy link
Member

Hi @leecookson, sorry to hear that the stack did not cope well with the schema evolution. Can you confirm, please: Did the data in the two older stacks change at some cut off date from requestip to request_ip for new data deliveries (if so, when?) or does it continue to be requestip?

@leecookson
Copy link
Author

the original stacks still have the pre-scheme-change field names. I've adapted by configuring things like host_where_clause clause and user_agent_field_name along with product-specific query configuration. These values get spliced into the SELECT statements so they work properly for both old and new stacks. Since I already had config per-product for database name and host header value, adding these wasn't a big problem, but still an awkward compromise.

I've never re-deployed the cloud formation stacks once deployed, and if I had, I guess it could have broken existing queries using the no-hyphen field names.

@steffeng
Copy link
Member

I haven't tried it myself, but if you want to create a new stack for the old data, you need to recreate and reconvert the partitions with a script (or an AWS Glue crawler) from the gzipped files to get rid of all old schema names.

The easier way if you just want to query the old stacks and the new ones with a uniform query, is to wrap the old combined view with a new view such as:

CREATE OR REPLACE VIEW "combined2" AS 
SELECT
  requestip request_ip,
  hostheader host_header
  [...]
FROM
  combined

Would the latter work in your case?

@steffeng
Copy link
Member

@leecookson, could you resolve this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants