Create Auto ML operators for Vertex AI service #21470
Force-pushed from 4a541c5 to 809af22.
class VertexAIModelLink(BaseOperatorLink):
    """Helper class for constructing Vertex AI Model link"""

    name = "Vertex AI Model"

    def get_link(self, operator, dttm):
        model_conf = XCom.get_one(
            key='model_conf', dag_id=operator.dag.dag_id, task_id=operator.task_id, execution_date=dttm
        )
        return (
            VERTEX_AI_MODEL_LINK.format(
                region=model_conf["region"],
                model_id=model_conf["model_id"],
                project_id=model_conf["project_id"],
            )
            if model_conf
            else ""
        )
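The fallback behaviour of `get_link` (return an empty string when no `model_conf` was stored) can be checked outside Airflow. The following is only a minimal sketch: `EXAMPLE_MODEL_LINK` and `build_model_link` are hypothetical names, and the URL template is a placeholder, since the real `VERTEX_AI_MODEL_LINK` constant is not shown in this excerpt.

```python
from typing import Mapping, Optional

# Hypothetical placeholder template. The real VERTEX_AI_MODEL_LINK value
# is not part of this excerpt, so this URL is an assumption for illustration.
EXAMPLE_MODEL_LINK = (
    "https://example.invalid/locations/{region}/models/{model_id}?project={project_id}"
)


def build_model_link(model_conf: Optional[Mapping[str, str]]) -> str:
    """Format the link from a stored configuration, or return "" if none exists."""
    if not model_conf:
        # Mirrors the `if model_conf else ""` branch in get_link.
        return ""
    return EXAMPLE_MODEL_LINK.format(
        region=model_conf["region"],
        model_id=model_conf["model_id"],
        project_id=model_conf["project_id"],
    )
```

Keeping the formatting in a plain function like this makes the empty-XCom case easy to unit test without a running DAG.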
Same comment as in #21267 (comment), let's try to be consistent.
project_id: str,
region: str,
display_name: str,
labels: Optional[Dict[str, str]] = None,
training_encryption_spec_key_name: Optional[str] = None,
model_encryption_spec_key_name: Optional[str] = None,
# RUN
training_fraction_split: Optional[float] = None,
test_fraction_split: Optional[float] = None,
model_display_name: Optional[str] = None,
model_labels: Optional[Dict[str, str]] = None,
sync: bool = True,
gcp_conn_id: str = "google_cloud_default",
delegate_to: Optional[str] = None,
impersonation_chain: Optional[Union[str, Sequence[str]]] = None,
I remember that some time ago Google recommended avoiding such interfaces, because there were a lot of issues like "can we add a foo_bar argument to operator XYZ". Instead, it was suggested that an operator should accept a "body": an object that is accepted by the underlying API.
I am fine with both approaches as long as Google wants to maintain it ;)
@MaksYermak Is this consistent with the approach that we have in other Google operators?
I think it should be good as long as we are consistent everywhere.
@kosteev Yes, it is consistent. We have the same approach in other Google operators.
Thx.
@MaksYermak @kosteev it's a matter of definition. For example, the Google AutoML operators accept a model argument instead of the individual elements of the model, as we do here:
model: dict,
Some time ago we had an effort that resulted in refactoring the Dataproc and BigQuery operators (as well as others) to follow a "single input" approach, because it was considered more generic and easier to follow. It is also more in line with the change Google made between the v1 and v2 clients: https://googleapis.dev/python/dataproc/latest/UPGRADING.html#method-calls
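To make the two interface styles under discussion concrete, here is a minimal sketch in plain Python. The function names `build_request_flat` and `build_request_from_body` are hypothetical and are not part of this PR or of any Google client; the field names are borrowed from the signature quoted above purely for illustration.

```python
from typing import Any, Dict, Optional


def build_request_flat(
    display_name: str,
    labels: Optional[Dict[str, str]] = None,
    training_fraction_split: Optional[float] = None,
) -> Dict[str, Any]:
    """Flat style: every API field is its own keyword argument.

    Supporting a new API field means changing this signature.
    """
    request: Dict[str, Any] = {"display_name": display_name}
    if labels is not None:
        request["labels"] = labels
    if training_fraction_split is not None:
        request["training_fraction_split"] = training_fraction_split
    return request


def build_request_from_body(body: Dict[str, Any]) -> Dict[str, Any]:
    """Single-input style: accept whatever mapping the underlying API accepts.

    New API fields pass through without any signature change.
    """
    return dict(body)  # Copy so the caller's dict is not mutated later.
```

Both calls below produce the same request; the difference is only where the list of supported fields lives.

```python
flat = build_request_flat("my-job", labels={"team": "ml"})
body = build_request_from_body({"display_name": "my-job", "labels": {"team": "ml"}})
```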
I am fine to go with it. This looks like a nicely generated piece of code, doesn't it?
Looks like there are some import errors though :(
The PR is likely OK to be merged with just a subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.
Please see this comment, let's try to be consistent 👌
Please see this comment, let's try to be consistent 👌
Force-pushed from 809af22 to 1f0ac36.
@potiuk @turbaszek I have updated the links for Vertex AI with the new approach. One more thing: together with @lwyszomi and @wojsamjan, I have decided to create a separate package for links. @potiuk @turbaszek what do you think about it?
Force-pushed from 1f0ac36 to ecdfa1d.
Force-pushed from ecdfa1d to 3e8184f.
should have a transformation. If an input column has no transformations on it, such a column is
ignored by the training, except for the targetColumn, which should have no transformations
defined on. Only one of column_transformations or column_specs should be passed. Consider using
column_specs as column_transformations will be deprecated eventually.
@MaksYermak can you please clarify what "will be deprecated eventually" means here?
This is the first time we are adding this operator, so why do we warn about deprecation? And if there is something the user needs to be aware of, why do we warn only in the docstring?
@eladkal thank you for your comment. It is my mistake: I didn't notice that this parameter is deprecated in the google-cloud-aiplatform library. I will create a PR with a deprecation warning for this parameter.
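For context, a runtime deprecation warning of the kind mentioned here could look roughly like the sketch below. This is an assumption about the shape of such a change, not the follow-up PR itself: `resolve_column_config` is a hypothetical helper, and only the parameter names `column_specs` and `column_transformations` come from the docstring quoted above.

```python
import warnings
from typing import Any, Dict, List, Optional


def resolve_column_config(
    column_specs: Optional[Dict[str, str]] = None,
    column_transformations: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Validate the two mutually exclusive parameters and warn on the deprecated one."""
    if column_specs is not None and column_transformations is not None:
        # The docstring states that only one of the two should be passed.
        raise ValueError(
            "Only one of column_transformations or column_specs should be passed."
        )
    if column_transformations is not None:
        # Warn at call time so users see it in logs, not only in the docstring.
        warnings.warn(
            "column_transformations is deprecated; consider using column_specs.",
            DeprecationWarning,
            stacklevel=2,
        )
        return {"column_transformations": column_transformations}
    return {"column_specs": column_specs}
```

Using `stacklevel=2` points the warning at the caller's line rather than at the helper, which is usually what the user needs in order to find the offending argument.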
Create operators for working with Auto ML for Vertex AI service. Includes operators, hooks, example dags, tests and docs.
Co-authored-by: Wojciech Januszek <januszek@google.com>
Co-authored-by: Lukasz Wyszomirski <wyszomirski@google.com>
Co-authored-by: Maksim Yermakou <maksimy@google.com>