Upgrading to airflow 2.4.0 from 2.3.4 causes NotNullViolation error #26497
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
ah, seems to be the "backend version" of the same issue #26505 ----- update ----- |
Hello, we are also facing this problem. Existing DAGs show up in the Airflow UI, but we can't create a new DAG. We also use the official Helm chart to deploy Airflow, with a Postgres DB as the backend. My feeling is that this is not related to #26505. |
I don’t think this is related to #26505; it looks like something in the permission stack is not working well with the AIP-48 stuff? |
We saw the same problem this past week in an attempted upgrade to 2.4 from 2.1.0. We are using Ubuntu and Postgres in AWS with the scheduler/webserver on an EC2 instance sending work to a Kubernetes cluster on EKS. We tried both DAGs show up in the CLI and in the UI, but can't be found when you try to view job details in the UI. I noticed that the scheduler constantly throws errors about jobs not being in the serialized DAG table. Manually running I used This worked and I'm including it here in case it helps diagnose the root cause, but I have no idea if this introduced other issues and am hesitant to promote this to our upper environments without more research. |
Question: while this looks bad, does it actually cause any process to exit with an error? |
I upgraded our server from Ubuntu 18.04 to 20.04.5 and tried installing 2.4.0 again. The same issue persisted, so we've rolled back to 2.3.4. I have another server on 18.04 that successfully managed to upgrade from 2.3.4 to 2.4, so if you need me to make any comparison between the two servers (although the failing one is now on 20.04) to get to the root cause, let me know. |
This will be caused by a difference in the rows in the database, not in the OS version. Is there any chance you could (privately if needed) share a DB dump of the breaking install? |
@ManikandanUV What version of |
@ashb Working server: |
@ashb how do I get the DB dump? The servers have different DAGs, so they're not comparable in that respect. Also, the working server started with 2.x (not sure which one, maybe 2.2) -> 2.3.4 -> 2.4, but the failing one went 1.x -> 1.10.15 -> 2.3.4. |
@ashb We have experienced that problem as well while testing the 2.4.0 upgrade (from 2.2.4). We use CentOS 7 for Airflow. |
From what version did you start using airflow? Asking so I can try to upgrade from that version to 2.3.4 to 2.4.0 |
I think we started from version I tested it on new 2.2.4 deployment and upgraded it to 2.4.0. I have experienced the same problem. |
@ashb I attached examples of the two errors I saw in case it's helpful. |
We are seeing the same issue upgrading from
|
Can anyone give us reproduction steps? Cos trying this with a "minimal" 2.3.4 and then upgrading to 2.4.0, neither I nor @ephraimbuddy have been able to reproduce this, so we're not sure what step we're missing. (And until we can reproduce it, we can't fix it.) |
@ashb try our upgrade path, maybe that's the key. I have two servers, one started at 1.x and the other started at 2.x. The 2.x server upgraded to 2.3.4 and 2.4 without issues, whereas the 1.x server upgraded to 2.3.4 and is now failing the 2.4 upgrade from 2.3.4. |
Thanks for the tip @sterling-jackson. Your suggestion resolved this one for us. If you use this solution alter the
|
Piling on here. I'm seeing the same thing. Seems to maybe be related to Datasets?
We started on version 1.10.xx too and have been upgrading with each release. Edit: I see now, there is a Datasets view that was added. |
We started with 1.8.xx, went to 1.9.xx, 1.10.xx, and somehow all of our FAB tables ended up without sequences set for their IDs, but had the sequences created. We were seeing similar issues in 2.4.0, and manually ran:
ALTER TABLE "public"."ab_permission_view" ALTER COLUMN "id" SET DEFAULT nextval('ab_permission_view_id_seq'::regclass);
ALTER TABLE "public"."ab_permission" ALTER COLUMN "id" SET DEFAULT nextval('ab_permission_id_seq'::regclass);
ALTER TABLE "public"."ab_permission_view_role" ALTER COLUMN "id" SET DEFAULT nextval('ab_permission_view_role_id_seq'::regclass);
ALTER TABLE "public"."ab_register_user" ALTER COLUMN "id" SET DEFAULT nextval('ab_register_user_id_seq'::regclass);
ALTER TABLE "public"."ab_role" ALTER COLUMN "id" SET DEFAULT nextval('ab_role_id_seq'::regclass);
ALTER TABLE "public"."ab_user" ALTER COLUMN "id" SET DEFAULT nextval('ab_user_id_seq'::regclass);
ALTER TABLE "public"."ab_user_role" ALTER COLUMN "id" SET DEFAULT nextval('ab_user_role_id_seq'::regclass);
ALTER TABLE "public"."ab_view_menu" ALTER COLUMN "id" SET DEFAULT nextval('ab_view_menu_id_seq'::regclass);
Which resolved our issue. |
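For anyone who would rather generate these statements than copy them by hand, here is a minimal sketch that rebuilds the same eight statements from the table names listed above. The helper names (`FAB_TABLES`, `set_default_statement`, `build_statements`) are made up for illustration and are not part of Airflow; the sketch also assumes every sequence follows the `<table>_id_seq` naming used in the comment above and that the tables live in the `public` schema. It only prints SQL, it does not connect to a database.

```python
# Hypothetical helper (not part of Airflow): rebuilds the ALTER TABLE
# statements from the comment above for the FAB tables it lists.
FAB_TABLES = [
    "ab_permission_view",
    "ab_permission",
    "ab_permission_view_role",
    "ab_register_user",
    "ab_role",
    "ab_user",
    "ab_user_role",
    "ab_view_menu",
]


def set_default_statement(table, schema="public"):
    """Return one statement pointing the id column at <table>_id_seq."""
    return (
        f'ALTER TABLE "{schema}"."{table}" ALTER COLUMN "id" '
        f"SET DEFAULT nextval('{table}_id_seq'::regclass);"
    )


def build_statements(tables=FAB_TABLES, schema="public"):
    """Return the statements for all listed tables, in list order."""
    return [set_default_statement(table, schema) for table in tables]


if __name__ == "__main__":
    # Print the SQL so it can be reviewed before running it in psql.
    print("\n".join(build_statements()))
```

Review the printed SQL against your own schema before running it, and take a database backup first.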
Thanks @joshowen that's very helpful |
We have the same problem after upgrading from 2.3.4 to 2.4.1 Thanks @joshowen, these commands resolved the problem. |
Okay, I've found the source of the confusion, and the path needed to trigger this behaviour. Run airflow webserver with < 1.10.13 in RBAC mode, where FAB creates its tables. In 1.10.13 we introduced a migration that creates the tables with the server_default, but that migration only did anything if the tables didn't already exist. But the tables created by the FAB model have a default (but not a server_default). Oh, and the final bit of the puzzle, in 2.4 we finally "took control" of the FAB security models into Airflow and those do not have the I'll work on a new migration to fix this up. |
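To illustrate why a missing database-side default matters, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for Postgres. It is an analogy only: the table definitions, the constant `DEFAULT 1`, and the helper name `insert_without_id` are assumptions made for the demonstration, not the real FAB schema or Airflow code. The point is that when the application stops supplying `id` itself, an `INSERT` that omits the column succeeds only if the database has its own default for it.

```python
import sqlite3

# Illustrative table definitions only; the real FAB tables have more
# columns and use Postgres sequences rather than a constant default.
NO_SERVER_DEFAULT = "CREATE TABLE ab_role (id INTEGER NOT NULL, name TEXT)"
WITH_SERVER_DEFAULT = (
    "CREATE TABLE ab_role (id INTEGER NOT NULL DEFAULT 1, name TEXT)"
)


def insert_without_id(create_sql):
    """Create the table, then insert a row without giving a value for id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(create_sql)
        try:
            conn.execute("INSERT INTO ab_role (name) VALUES ('Viewer')")
        except sqlite3.IntegrityError as exc:
            # The database has no value for the NOT NULL id column.
            return f"failed: {exc}"
        return "ok"
    finally:
        conn.close()


if __name__ == "__main__":
    print("no database default:  ", insert_without_id(NO_SERVER_DEFAULT))
    print("with database default:", insert_without_id(WITH_SERVER_DEFAULT))
```

The first call fails with a NOT NULL constraint error, which is the SQLite counterpart of the NotNullViolation in this issue; the second succeeds because the database fills in the column.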
Ran into the same issue. We also started in 2017 which should be around version 1.8 something and have upgraded since then. I ran the commands @joshowen posted and now I can open the DAG again. I am getting a |
Not until a new user couldn't register did I realize I was having this issue, with a ton of such error messages in the logs 😂 |
We ran into this issue upgrading from 2.3.1 to 2.4.1 so it doesn't seem the issue is fixed yet. These table alterations resolved the problem though. |
@ashb These must be run before running |
Argh! Good catch |
Apache Airflow version
2.4.0
What happened
Stopped existing processes, upgraded from airflow 2.3.4 to 2.4.0, and ran airflow db upgrade successfully. Upon restarting the services, I'm not seeing any dag runs from the past 10 days. I kick off a new job, and I don't see it show up in the grid view. Upon checking the systemd logs, I see a lot of Postgres errors from the webserver. Below is a sample of such errors.
I tried running airflow db check, init, check-migration, upgrade without any errors, but the errors still remain.
Please let me know if I missed any steps during the upgrade, or if this is a known issue with a workaround.
What you think should happen instead
All dag runs should be visible
How to reproduce
upgrade airflow, upgrade db, restart the services
Operating System
Ubuntu 18.04.6 LTS
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct