BQ: review #21

Open
3 of 16 tasks
laszewsk opened this issue Oct 19, 2020 · 2 comments
@laszewsk
Member

laszewsk commented Oct 19, 2020

  • location of old bigquery command before merger to google:
  • the implementation of bigquery needs to be completed
  • the command should be google bigquery
  • there are multiple directories with bigquery, including one that has a namespace init, which seems wrong. That init should only be in the cloudmesh dir
  • the code must be tested and working
  • other commands need to be tested to confirm they still work after you make your changes
  • development must be in the branch bigquery
  • after things pass, we merge carefully into main and redo all tests
@laszewsk
Member Author

Make sure you are in sync with main. Please remember that master no longer exists, and you must not use a clone that still has master in it, or you will reintroduce it. You need a fresh clone if you have not already made one.

a) Create a branch named “bigquery” and do your checks in there.

=========

b) Your documentation in the README is insufficient, as it is not possible for me to figure out what the yaml file is and so on. This documentation is written from your perspective as an expert, but it cannot be understood by me. Likely you already have additional documentation, maybe in a file called README-bigquery.md.

Please provide a user manual in there. Do not use screenshots for code and directory structure; use code as text instead.

c) the test needs a verbal description of what you do

d) We need a more elaborate test that uses some data to benchmark this bigquery command. I am not sure what that will be, but it should not depend only on BigQuery. It should be replicated with another framework on your local machine, plus another cloud that has a service similar to BigQuery.

e) You have to rerun the google vm and storage code to verify that your integration does not break that code.
I think we have tests for that in cloudmesh-cloud.

Sorry, but because the others completed their code, they are no longer there to run and verify (e).

I do not think that any of this is difficult, but it does require you to do the integration.

By the way we had 2-3 students this semester verifying that google vm management does work, so there is great hope this will be easy.
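For the benchmark requested in (d), a minimal sketch of a timing harness might look like the following. Everything in it is an assumption made for illustration: the function names `make_sqlite_backend` and `benchmark`, the `samples` table, and the use of an in-memory SQLite database as the local stand-in. A Google backend (or one for another cloud) would be added as another callable in the same dictionary.

```python
import sqlite3
import time
from typing import Callable, Dict, List


def make_sqlite_backend(rows: int) -> Callable[[str], List[tuple]]:
    """Build a hypothetical local backend: an in-memory SQLite table with `rows` rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (id INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        ((i, i * 0.5) for i in range(rows)),
    )
    conn.commit()

    def run(sql: str) -> List[tuple]:
        # Execute the query and fetch all rows so the full cost is measured.
        return conn.execute(sql).fetchall()

    return run


def benchmark(
    backends: Dict[str, Callable[[str], List[tuple]]],
    sql: str,
    repeat: int = 3,
) -> Dict[str, float]:
    """Run the same query on every backend and keep the best wall-clock time in seconds."""
    timings: Dict[str, float] = {}
    for name, run in backends.items():
        best = float("inf")
        for _ in range(repeat):
            start = time.perf_counter()
            run(sql)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


if __name__ == "__main__":
    backends = {"local-sqlite": make_sqlite_backend(10_000)}
    for name, seconds in benchmark(backends, "SELECT COUNT(*), AVG(value) FROM samples").items():
        print(f"{name}: {seconds:.6f} s")
```

Keeping each backend behind a plain callable means the same query and the same data size can be replayed against every cloud, which is what makes the timings comparable.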

=========
Also make sure that you do not just say "here is my bigquery command" without providing reasonable documentation of what BigQuery is and how you use it from cloudmesh. You also need to explain why someone would use the cloudmesh bigquery command. The motivation is that you can run it on different clouds, such as

cms set cloud=google
cms bigquery ……

cms set cloud=local
cms bigquery ……

cms set cloud=aws
cms bigquery ……

So in essence we have two commands and two repos.

One in cloudmesh-google
Another one in cloudmesh-bigquery

The difference is that the one in google has an API that is used to implement the command in cloudmesh-google as well as in cloudmesh-bigquery. All google related code will then be in the google related directory.

The local provider can be implemented in cloudmesh-bigquery.
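One way to picture the split between a shared API and per-cloud implementations is a small abstract interface that each provider fulfils. The sketch below is only an illustration under stated assumptions: the class names `QueryProvider` and `LocalQueryProvider` are hypothetical and not part of cloudmesh, and SQLite stands in for whatever local engine is eventually chosen. A Google provider would implement the same `query` method using the Google API that lives in cloudmesh-google.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import List


class QueryProvider(ABC):
    """Hypothetical common interface that every cloud-specific provider implements."""

    @abstractmethod
    def query(self, sql: str) -> List[tuple]:
        """Run a SQL statement and return all result rows."""


class LocalQueryProvider(QueryProvider):
    """Hypothetical local provider; SQLite is an assumed stand-in engine."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)

    def query(self, sql: str) -> List[tuple]:
        return self._conn.execute(sql).fetchall()
```

With this shape, the command code only depends on `QueryProvider`, so the Google-specific code can stay in the google related directory.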

If this works, I will then also introduce a new bundle that loads

cms
cloudmesh-google
cloudmesh-bigquery

How to factor the structure actually follows from a documented use case that you will need to define. Remember this is about multiple clouds, one of which must be google, plus local in your case. Most students have also implemented their activity for a second cloud to showcase the differences in a benchmark.

Contact me if anything is unclear. I like to help, and getting this right from the start is important. So point me to your documented use case so we can avoid you getting stuck. Communication is important.

@laszewsk changed the title from "bigquery" to "BQ: review" on Dec 23, 2020
@laszewsk
Member Author

The issue is that we need both, but we start with google.

a) First we need the API in google for the provider, as we need to make sure the authentication and the library also work with VM and storage.
b) Once we have the API, it is trivial to develop a command that does not have google in it, but we start with google as this is easy to implement anyway.
c) Once we are satisfied with google, we can build multi-cloud support.
d) The key point in the bigquery command is that the loading of the google provider is done on demand, just as we do in cloudmesh-cloud. There, based on the set cloud= command, we detect which cloud is set and load the API dynamically. Thus we can easily switch between clouds and issue queries to multiple databases.

So the issue is not

cms google bigquery -> goes to google
cms bigquery -> goes to google

But the architecture is actually

cms bigquery -> read which cloud -> dynamically load provider based on cloud -> issue big query to provider

This is a very different architecture
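The flow above (read which cloud, load the provider on demand, issue the query) can be sketched roughly as follows. This is not the cloudmesh implementation: the provider classes, the `LOADERS` registry, and the `settings` dictionary that stands in for the value stored by `cms set cloud=` are all hypothetical names chosen for illustration. In a real package each loader would perform a deferred import of the provider module instead of returning a class defined in the same file.

```python
from typing import Callable, Dict


class GoogleProvider:
    """Hypothetical stand-in for the Google-backed provider."""

    def query(self, sql: str) -> str:
        return f"[google] {sql}"


class LocalProvider:
    """Hypothetical stand-in for a provider that runs on the local machine."""

    def query(self, sql: str) -> str:
        return f"[local] {sql}"


def _load_google() -> type:
    # Assumption: a real loader would import the provider module here, on demand.
    return GoogleProvider


def _load_local() -> type:
    return LocalProvider


# Maps the value of the `cloud` setting to a loader that is only called when needed.
LOADERS: Dict[str, Callable[[], type]] = {
    "google": _load_google,
    "local": _load_local,
}


def get_provider(cloud: str):
    """Resolve and instantiate the provider registered for the given cloud name."""
    try:
        loader = LOADERS[cloud]
    except KeyError:
        raise ValueError(f"no bigquery provider registered for cloud={cloud!r}") from None
    return loader()()


def bigquery(settings: Dict[str, str], sql: str) -> str:
    """Read which cloud is set, load its provider dynamically, and issue the query."""
    provider = get_provider(settings["cloud"])
    return provider.query(sql)
```

For example, `bigquery({"cloud": "local"}, "SELECT 1")` is routed to `LocalProvider`, while changing the setting to `google` routes the identical call to `GoogleProvider` without touching the command code.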
