Errors on Engine pods after setting up ipam on AKS #225

Closed
cpareek opened this issue Jan 31, 2024 · 9 comments

@cpareek

cpareek commented Jan 31, 2024

We have set up IPAM on AKS using the example deployments below.
#32

Both the engine and UI pods are running fine, and the engine pod can also connect to the Cosmos database.

However, on the engine pod we are seeing the errors below. Any suggestions for solving them?

ipam-engine-7645864887-tnfzb -n ipam
INFO: Will watch for changes in these directories: ['/ipam']
INFO: Uvicorn running on http://0.0.0.0:80 (Press CTRL+C to quit)
INFO: Started reloader process [8] using WatchFiles
2024-01-31 14:03:11.857 | INFO | uvicorn.server:serve:76 - Started server process [10]
2024-01-31 14:03:11.857 | INFO | uvicorn.lifespan.on:startup:46 - Waiting for application startup.
2024-01-31 14:03:11.858 | INFO | app.main:set_globals:363 - Creating Database...
2024-01-31 14:03:12.005 | INFO | app.main:set_globals:374 - Creating Container...
2024-01-31 14:03:12.489 | INFO | app.main:db_upgrade:189 - No existing spaces to convert...
2024-01-31 14:03:12.503 | INFO | app.main:db_upgrade:209 - No existing users to convert...
2024-01-31 14:03:12.569 | INFO | app.main:db_upgrade:229 - No existing admins to convert...
2024-01-31 14:03:12.633 | INFO | app.main:db_upgrade:248 - No existing user objects to patch...
2024-01-31 14:03:12.638 | INFO | app.main:db_upgrade:268 - No existing admin objects to patch...
2024-01-31 14:03:12.657 | INFO | app.main:db_upgrade:294 - No existing reservations to patch...
2024-01-31 14:03:12.662 | INFO | app.main:db_upgrade:321 - No existing External CIDRs to patch...
2024-01-31 14:03:12.663 | INFO | uvicorn.lifespan.on:startup:60 - Application startup complete.
2024-01-31 14:03:13.669 | ERROR | app.main:find_reservations:395 - Error running network check loop!
2024-01-31 14:04:13.733 | ERROR | app.main:find_reservations:395 - Error running network check loop!
2024-01-31 14:05:13.813 | ERROR | app.main:find_reservations:395 - Error running network check loop!
2024-01-31 14:06:13.867 | ERROR | app.main:find_reservations:395 - Error running network check loop!
2024-01-31 14:07:13.945 | ERROR | app.main:find_reservations:395 - Error running network check loop!
2024-01-31 14:08:14.010 | ERROR | app.main:find_reservations:395 - Error running network check loop!

@DCMattyG
Contributor

Hey @cpareek, thanks for reaching out on this issue!

I'm glad you were able to follow the examples another user shared for getting the Azure IPAM project running inside AKS. I am curious how you set up the required Service Principals and their associated permissions, as well as how you're passing those variables/secrets into the containers within AKS.

Are those objects something you manually created?

@DCMattyG DCMattyG added the help wanted Extra attention is needed label Jan 31, 2024
@DCMattyG DCMattyG self-assigned this Jan 31, 2024
@cpareek
Author

cpareek commented Feb 1, 2024

Hi @DCMattyG - Thanks for the response.

Yes, the service principals and associated permissions were created following this documentation: https://azure.github.io/ipam/#/deployment/README
We already have a working IPAM setup hosted on a Web App, so I am using the same service principal that works on the Web App for the AKS setup.
I can see the environment variables set here: https://github.com/Azure/ipam/blob/main/deploy/appService.bicep#L87-L135
In the engine-deployment.yaml file on my AKS cluster, I have the environment variables below set up, with their values fetched from a Secret in the same namespace. Because all my secrets are stored as Kubernetes Secrets, I assumed Key Vault is not needed, hence no KEYVAULT_URL is set. Let me know if you think I am missing something.

COSMOS_KEY
COSMOS_URL
ENGINE_APP_ID
ENGINE_APP_SECRET
TENANT_ID
UI_APP_ID

I think my engine pod can connect to Cosmos DB fine, and I can see that the database has been created in the Cosmos DB account.
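
One way to confirm that the variables listed above actually reach the engine container is to check for them from inside the pod. The following is a minimal sketch and not part of Azure IPAM: `missing_env_vars` is a hypothetical helper, and the names it checks are simply the ones listed in this comment, which may differ from the names the engine container itself reads.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Names copied from the comment above. Whether the engine container reads
# these exact names is an ASSUMPTION and is not confirmed in this thread.
LISTED_VARS = [
    "COSMOS_KEY",
    "COSMOS_URL",
    "ENGINE_APP_ID",
    "ENGINE_APP_SECRET",
    "TENANT_ID",
    "UI_APP_ID",
]


def missing_env_vars(
    required: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required names that are unset or blank (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(LISTED_VARS)
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All listed variables are set.")
```

A check like this only reports whether a name is present. It cannot tell whether the name is the one the application expects, so the list should be compared against the project's own deployment files.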

@cpareek
Author

cpareek commented Feb 5, 2024

@DCMattyG Any thoughts on the above, please? I can see the error is coming from this Python script here, but I'm unsure why it's throwing that.

logger.error('Error running network check loop!')

@DCMattyG
Contributor

DCMattyG commented Feb 5, 2024

Hi @cpareek, have you checked that the Environment Variables are correctly mapped to their respective containers with the appropriate names?

If you take a look at the Docker Compose YAML we use, you'll see that the variable names are manipulated when passed to the various containers as such:

[image: Docker Compose YAML showing the environment variable names passed to each container]

I'm not sure if you're doing the equivalent in AKS or not... perhaps something to check?
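
The renaming described above can be pictured as a mapping from deployment-level variable names to the names each container expects. The sketch below is illustrative only: `remap_env` is a hypothetical helper, and the container-side names in `EXAMPLE_MAPPING` are placeholders, not the names used in the project's Docker Compose file (the image referenced above is not reproduced in this text).

```python
from typing import Dict, Mapping

# Keys are deployment-level names mentioned in this thread. Values are
# HYPOTHETICAL container-side names used purely for illustration; consult
# the project's Docker Compose file for the real ones.
EXAMPLE_MAPPING = {
    "ENGINE_APP_ID": "CONTAINER_SIDE_APP_ID",
    "ENGINE_APP_SECRET": "CONTAINER_SIDE_APP_SECRET",
    "TENANT_ID": "TENANT_ID",  # some names may pass through unchanged
}


def remap_env(source: Mapping[str, str], mapping: Mapping[str, str]) -> Dict[str, str]:
    """Build a container environment by renaming keys according to `mapping`.

    Raises KeyError when a deployment-level variable is absent, so a
    missing or misnamed variable fails loudly rather than silently.
    """
    return {target: source[name] for name, target in mapping.items()}


if __name__ == "__main__":
    deployment_env = {
        "ENGINE_APP_ID": "example-app-id",
        "ENGINE_APP_SECRET": "example-secret",
        "TENANT_ID": "example-tenant",
    }
    for key, value in remap_env(deployment_env, EXAMPLE_MAPPING).items():
        print(f"{key}={value}")
```

In a Kubernetes Deployment the same idea applies to the `name` field of each container `env` entry: it has to be the name the container reads, regardless of what the Secret key is called.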

@cpareek
Author

cpareek commented Feb 5, 2024

Hello @DCMattyG - I did have all the environment variables for ipam-engine but did not have any for ipam-ui, which I have just set up. That still did not help, and it's throwing the same errors on ipam-engine.
I'd really like to get this working :)

@DCMattyG
Contributor

DCMattyG commented Feb 5, 2024

Hey @cpareek, while I do understand that you are anxious to get this solution working in AKS, I think it's important to note that we aren't specifically AKS (or Kubernetes) experts per se. This service is composed of a handful of containers, and there are A LOT of different ways to run containers at the end of the day.

As long as the appropriate Environment Variables are configured and the containers can reach all of the required services, that should be all that is needed for this solution to function.

I'm more than happy to arrange a Teams meeting for us so I can review your setup to the best of my ability. I'm guessing there is perhaps a setting missing or something similar that is causing this error. Please feel free to send me an email at Matthew.Garrett@microsoft.com and we can find a time that works for your time zone.

Hopefully a second set of eyes is all you'll need here, and for that I'm happy to be of assistance.

@cpareek
Author

cpareek commented Feb 6, 2024

Hi @DCMattyG - Thank you for your response. Sure, let's have a look together. I will email you about this. Many thanks.

@cpareek
Author

cpareek commented Feb 12, 2024

This issue has now been resolved. The error messages above were actually irrelevant; the real problem was that our environment variable was named incorrectly, which @DCMattyG kindly spotted on a Teams call.

Our IPAM AKS deployment is public, so you should be able to see all the resources and documentation here in case you need them. Enjoy!
https://github.com/hmcts/sds-flux-config/tree/master/apps/ipam

@DCMattyG
Contributor

Wonderful news @cpareek and thank you so much for sharing your configuration so others can leverage that in the future.

It was wonderful chatting with you, and I hope you're loving the Azure IPAM project!
