Status of testing Providers that were prepared on November 15, 2022 #27674
Note that
Attempted to test. One workaround would be to update
Actually, the RC versions should also support dependencies on other RC packages, but I will check it later (on the run now). You can work around it and install both packages with
@r-richmond I cannot reproduce it. Where did you install the rc1 package from? I just double-checked, and our RC packages (in PyPI) have a mechanism implemented to cope with the issue you mentioned; they properly depend on a .* version in case of such a cross-dependency:
When I try to install
Note: So I wonder where your errors came from?
OK. I think I know. From the error message, it seems you are using poetry @r-richmond. Poetry is not supported for installing Airflow, and it apparently has a bug where it cannot properly handle the valid >= .* requirement. You MUST use
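The `.*` cross-dependency behaviour discussed above can be checked locally with the `packaging` library (the same specifier semantics pip follows). This snippet is only an illustration, not Airflow or pip code; the version numbers are taken from this wave:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# A cross-dependency expressed as "==1.3.*" matches an RC build only when
# pre-releases are explicitly allowed - roughly what happens when pre-releases
# are opted into (e.g. via an explicit rc pin or pip's --pre flag).
spec = SpecifierSet("==1.3.*")
rc = Version("1.3.0rc1")

print(spec.contains(rc, prereleases=True))  # True
print(spec.contains(rc))                    # False - pre-releases excluded by default
```

This is why installing both RC packages together with an explicit pin works, while a tool that mishandles pre-release matching against `.*` specifiers fails to resolve them.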
@potiuk The following work great: Provider ssh: 3.3.0rc1
@potiuk The following work great:
We are getting the below error when trying to install the Kubernetes RC via setup.cfg.
Any hint that could help us solve this problem? The same via pip install works. But somehow it is failing with setup.cfg only for this RC; all other RCs work fine via setup.cfg.
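For reference, a minimal setup.cfg fragment pinning the RC under discussion might look like this (illustrative only - this is not the reporter's actual file, just the standard setuptools layout with the RC version from this thread):

```ini
# Illustrative fragment - the pin below is an assumption based on the
# cncf.kubernetes RC being tested in this thread.
[options]
install_requires =
    apache-airflow-providers-cncf-kubernetes==5.0.0rc3
```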
#26784 - Tested on both actual AWS SSM Parameter Store and AWS Secrets Manager, and the circular error is gone (I hope forever)
#26687 - Tested non-doc parts
#26946 - Assume role with credentials fixed and works as initially expected
#27134 - Most of the changes relate to documentation, but I tested obtaining a connection when it is stored in AWS SSM as JSON anyway
#26953 - Works as expected
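For context on the #27134 test above: a connection "stored in AWS SSM as JSON" is a parameter whose value is a JSON object of connection fields, which the secrets backend deserializes. A minimal sketch (the parameter value and path below are made up for illustration):

```python
import json

# Hypothetical SSM parameter value for a connection stored as JSON,
# e.g. under a path like /airflow/connections/my_aws.
param_value = '{"conn_type": "aws", "login": "my-key-id", "extra": {"region_name": "eu-west-1"}}'

# The secrets backend deserializes the JSON into connection fields.
conn = json.loads(param_value)
print(conn["conn_type"])             # aws
print(conn["extra"]["region_name"])  # eu-west-1
```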
My guess is that you might have the other RCs already installed locally, and that's why they are found by
Thanks Jarek. Sorry for the bother here; it looks like it is an internal Astronomer issue caused by a constraint we have on the Kubernetes provider releases. @kaxil helped us figure this out, so thanks to Kaxil too :)
@potiuk This particular issue is fixed in
Of course - it's that installing the new provider bumped the trino upgrade. If you already had an earlier version of the trino library installed, the only way to upgrade it would be to manually upgrade the trino library. If we did not have >= 0.318, it would not have happened.
Make GSheetsHook return an empty list when there are no values (#27261) works as expected. |
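The behaviour confirmed above can be sketched as follows. This is not the hook's actual implementation, only an illustration of the contract introduced by #27261: the Google Sheets API omits the "values" key for an empty range, and the hook now returns an empty list in that case instead of None.

```python
def get_values(response: dict) -> list:
    # The Sheets API leaves out the "values" key entirely when the requested
    # range has no data, so default to an empty list rather than None.
    return response.get("values", [])

print(get_values({}))                        # []
print(get_values({"values": [["a", "b"]]}))  # [['a', 'b']]
```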
@potiuk the yandex provider changelog has a note about a 4.0 release, since I thought that we would be bumping (as was the practice at the time), but the actual release is 3.2.0, so the release notes need to be fixed. Confirmed it's in the release package. I am going through all of them now; I'll compile fixes here: #27726. All fixes added to that PR:
Ok. Thanks. Missed it. I will remove yandex and make a follow-up release.
I will leave it as it is. It was originally marked as breaking, and even if it is not, it's still OK from a SemVer point of view (we are OK to bump a major version any time we want, regardless of whether there are breaking changes). This is at most a small annoyance.
Will leave it as is. This can be corrected in the docs (in the next version); it will stay in the PIP/Package README though. But again, it's a minor inconvenience.
Ok, I'll keep compiling things in that PR, and yeah, some things can be deferred for sure
Ok, I completed the review and all suggestions are in that PR
Ok @potiuk, in addition to yandex, jdbc references a non-existent release in the changelog, but the other suggestions I think can be docs-only
#27184 Tested - looks good |
Thanks everyone. Closing this one and releasing most providers (see announcement at devlist for details).
Unfortunately I'll have time to test the Databricks provider only on the weekend
Tested #26951 - Works as expected |
We can always fix in upcoming release if we find problems @alexott :D |
Can we revert this release of the databricks provider until this issue is investigated & fixed? It's really bad that the refactoring was done without testing of the affected operators.
We always welcome the community to test. This is why we have this tracker before release. I don't think there is a need to yank the release because one operator stopped working. Affected users can use an older version of the provider while a fix is in progress until the next release.
The problem here was that voting was open only during the work week, not over the weekend as usual. And I have time only over the weekend. I agree about having test coverage, but it is work in progress. P.S. It looks like #23971 broke saving results to a file
We are always looking for ways to improve :) |
Hello @alexott - I have always valued your work. And please don't take this personally. This comment is not a personal attack on you in any way - it's directed more towards your organization - but your comment came to me as a bit of a shock. And I presume you are speaking in the name of Databricks as an organisation here. If not, I am even a bit more shocked. Are you seriously saying that the work of volunteers who prepare and release ASF software, mostly in their free time (or sponsored by some responsible players), should be delayed because a commercial company - Databricks - interested in their (Databricks) integration could not afford any of their employees to spend time on it during their day job? Do I interpret your words well? Are you seriously saying that Databricks expects their integration with Airflow to work and get tested, but they are not able to allocate the time of the employees working on it to verify it during their day job, and instead they expect that the Airflow PMC process should be changed to impact (and delay) the general release process of 70+ providers because of that? If that's the case, I am literally shocked - even stunned. It shows a great misunderstanding of the role and contributions that a company can make to a project that they deeply care about. (Again - I have nothing against you personally; it's a comment to your organization.)
There is no "usual" thing here. I checked, and in the past voting was sometimes on Sat, sometimes on Mon, but sometimes on Thu. The 72 hours that the ASF requires is there for a reason. It should not only give a lot of time for reaction, but also account for geographical locations: https://www.apache.org/foundation/voting.html
Yes. There is "at least", but there is nothing about "make sure it goes through the weekend". Surely companies like Databricks have an on-call rotation with "hours" SLAs for their critical problems, and giving 72 hours' notice is far more relaxed and gives quite a lot of time to react for the company/team that takes care of it. Do I understand the situation correctly? Just another comment: if your organisation really wants to have control over the release process and testing - it's perfectly possible. We could actually let Databricks release their own provider (not as a community one). Databricks is free to do that - we can move "apache-airflow-providers-databricks" and let Databricks release their own "databricks-airflow-provider" (same as Great Expectations did). Then you will have full control over releases, testing and changes. You (Databricks) can do it very easily. Being part of the community and an ASF-ruled project means that you take both the benefits and the limitations that come with that. We do all the release process; we have a number of people (like @kazanzhy) who contribute and improve the stuff there. This all comes as a benefit for Databricks, as they don't necessarily have to put their own effort into improving and maintaining the libraries. And it comes with a cost - the cost is that the release process of the ASF has to be followed. The cost is that you should make sure the code is sufficiently covered by unit tests. Finally, the cost is that when we put it up for release, Databricks has 72 hrs to see if they are OK with the release - with all the prior notifications, warnings and even a detailed list of changes so that Databricks can focus just on this. And even more - with AIP-47, you have an opportunity (as Databricks) to create, maintain, develop and RUN system tests for your provider.
We have a very well defined way to build and add those tests, yet we EXPECT Databricks (similarly to Amazon and Google, who pave the way) to eventually run and report the status of their end-to-end integration of the provider. That would have caught the problem way earlier. And you are absolutely free to invest your time and effort to make those examples/system tests fully automatically runnable, have an account to run them, and actually run and report the results at whatever frequency you think is appropriate. This is something Databricks could do today. It just needs investment. And this is the way the "testability" of the Databricks provider can be improved - not by shifting voting to weekends. I'd really love for Databricks to make the investment there to improve the test quality. If they care about the quality of their integration - AIP-47 is precisely about that, and the best way of doing it.
And yes - it is the weekend, I am in the woods, and I just yanked the release (it's reversible as well). I am planning the follow-up release right after I am back (Monday afternoon), so if we find and investigate/fix the issue before that, I am happy to include the 3.4.1 databricks provider in this wave.
The following work as expected (better late than never) |
@eladkal - I think yanking is fine for that -> this is really a "soft" way of making releases not "searchable" when doing upgrades. People can still choose to install them manually, no problem. And since we have not included the new providers in "2.4.3", it has no impact on the default bugfix installations/image/constraints versions. I personally believe we should use yanking more frequently (and have no objections to it if there is a good reason) - for releases we do not want to let users install as "new", because we know they are somewhat seriously flawed.
I'm not sure I agree. |
(but this is not something I feel strongly about. Just my own view) |
@potiuk sorry for adding more work on your shoulders, and I appreciate your work and that of the other maintainers. We still don't have on-call for Airflow operators. Work on improving test coverage and setting up continuous testing is prioritized.
@alexott - Also to explain more (and apologise if it's been a little - or even more than a little - harsh). But I think this is really just a question of balance between the risk that we will break something and the time/effort needed from someone (who?) to do enough testing to be confident enough. There is never 100%. We cannot extend the testing period for a very long time because it will hamper our "operations". And we do not know who is or is not available for testing. We cannot really have a point where we say "we wait for response". The most we can do is make all reasonable effort to flag and notify everyone involved and give them SOME time to react. Is 72 hours enough? Should it be 96 maybe? Should we always have a weekend included? To be honest, I was quite surprised to see your statement - usually we made voting last one more day if the weekend was included, to account for the fact that people will NOT have time to do testing then, because usually they spend weekends with their families etc. I often (my personal choice) do some CI stuff during weekends because then it's faster and safer to iterate on it, and quite often I shift my working/non-working days freely because I have a very flexible situation (no day job, no kids, a wife who also has a super flexible schedule), but I would not think this is "needed" by others :) Do you really think having a weekend as part of the voting time for providers is a good idea and a way to improve the quality of testing in general? I think we COULD make sure we always start voting on Wed and finish on Tuesday; I just have a feeling that would be a rather artificial setting for one specific case only. I often did adjust the release schedule to various events (for example, I would never start voting this Wednesday, knowing that Thanksgiving is coming). Maybe this is also a question for others in this thread. We have quite a representative group here - do you think it would be a good idea to add some regularity to the day-of-week process?
@eladkal I looked at the change again, and I think I want to keep my approach - yanking - especially if we can fix it quickly - @alexott @kazanzhy? Can we have a fix for the problem today/tomorrow? I would then release a fixed version. Why do I think we should yank it? The SQL interface is an extremely important part of any modern data interface. Even if this is just SQLExecute, people will rely on it heavily, and by upgrading to the new Databricks provider a lot of workflows might be broken. Databricks is a popular and important part of our ecosystem. With this bug, we will not only impact Databricks' "view" but also Airflow's, and we might have to deal with many issues raised by our users. Yanking is the way to prevent those issues from happening in most cases. But it also does not prevent those who want to use the new features released in the new Databricks provider: they STILL can install the provider if they specify the exact version. Especially if we are able to release a fix in a few days, this will be only a minor inconvenience: users won't be able to use the new features of the Databricks provider "automatically". Compare that with the potential "blast radius" of a broken SQL operator for Databricks. It's a few orders of magnitude more serious a problem that our users might have to deal with if we do not yank it now. So @kazanzhy @alexott - is it possible to fix this problem quickly? If so, I am happy to wait with the RC2 candidate release I am going to do for the providers that we skipped in the last wave until this one is tested and merged.
Body
I have a kind request for all the contributors to the latest provider packages release.
Could you please help us to test the RC versions of the providers?
Let us know in the comments whether the issue is addressed.
Those are providers that require testing as there were some substantial changes introduced:
Provider alibaba: 2.2.0rc1
Provider amazon: 6.1.0rc1
  - `preserve_file_name` param to `S3Hook.download_file` method (#26886): @alexkruc
  - `GoogleApiToS3Operator`: add `gcp_conn_id` to template fields (#27017): @syedahsn
  - `existing_jobs_found` (#27456): @ferruzzi
  - `get_ui_field_behaviour` (#27533): @pankajastro
  - `get_table_primary_key` method (#27330): @pankajastro
Provider apache.beam: 4.1.0rc1
Provider apache.drill: 2.3.0rc1
Provider apache.druid: 3.3.0rc1
Provider apache.hive: 4.1.0rc1
Provider apache.livy: 3.2.0rc1
  - `appId` to xcom output (#27376): @bdsoha
Provider apache.pig: 4.0.0rc1
Provider apache.pinot: 4.0.0rc1
Provider apache.spark: 4.0.0rc1
Provider arangodb: 2.1.0rc1
Provider asana: 3.0.0rc1
Provider cncf.kubernetes: 5.0.0rc3
  - `resource` as dict in `KubernetesPodOperator` (#27197): @eladkal
Provider common.sql: 1.3.0rc1
Provider databricks: 3.4.0rc1
Provider docker: 3.3.0rc1
Provider exasol: 4.1.0rc1
Provider google: 8.5.0rc1
  - `_bq_cast` to `bq_cast` (#27543): @pankajastro
  - `id` key to retrieve the dataflow job_id (#27336): @dejii
Provider grpc: 3.1.0rc1
  - `extra__` instead of `extra_` in `get_field` (#27489): @dstandish
Provider hashicorp: 3.2.0rc1
Provider jdbc: 3.3.0rc1
  - `extra__` instead of `extra_` in `get_field` (#27489): @dstandish
Provider microsoft.azure: 5.0.0rc1
  - `extra__` instead of `extra_` in `get_field` (#27489): @dstandish
Provider microsoft.mssql: 3.3.0rc1
Provider microsoft.winrm: 3.1.0rc1
Provider mongo: 3.1.0rc1
Provider mysql: 3.3.0rc1
Provider opsgenie: 5.0.0rc1
Provider oracle: 3.5.0rc1
Provider postgres: 5.3.0rc1
Provider salesforce: 5.2.0rc1
Provider sendgrid: 3.1.0rc1
Provider sftp: 4.2.0rc1
Provider slack: 7.0.0rc1
Provider snowflake: 4.0.0rc1
Provider sqlite: 3.3.0rc1
Provider ssh: 3.3.0rc1
Provider tableau: 4.0.0rc1
Provider trino: 4.2.0rc1
Provider vertica: 3.3.0rc1
Provider yandex: 3.2.0rc1
Provider zendesk: 4.1.0rc1
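Several providers in the list (grpc, jdbc, microsoft.azure) share the same #27489 change: looking up connection extras with the `extra__` prefix instead of the older `extra_`. A hedged sketch of that lookup follows - this is not the providers' actual code, and the fallback to the bare field name is an assumption for illustration:

```python
def get_field(conn_type: str, extras: dict, field_name: str):
    # Prefer the double-underscore prefix used by provider connection
    # forms, e.g. "extra__jdbc__drv_path"; fall back to the bare name.
    prefixed = f"extra__{conn_type}__{field_name}"
    if prefixed in extras:
        return extras[prefixed]
    return extras.get(field_name)

extras = {"extra__jdbc__drv_path": "/opt/driver.jar"}
print(get_field("jdbc", extras, "drv_path"))  # /opt/driver.jar
```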
The guidelines on how to test providers can be found in Verify providers by contributors.