New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for ARM shapes on Oracle OCI #16277
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Images supporting Ubuntu 22.04 with amd64 are getting lost in the processing somewhere, not sure where: Image ocid1.image.oc1.phx.aaaaaaaapj3tk2to46hrihgab6aytk3xfv5tcdbdms4k2ixzoz64zxlpmjfq
with String Canonical-Ubuntu-22.04-aarch64-2023.08.23
is being parsed as Canonical Ubuntu 22.04
per MM yesterday
provider/oci/images.go
Outdated
// Specifically, for ubuntu, we need to create a different base when the image | ||
// name contains "aarch64", which are the images that target ARM-based shapes. | ||
func NewInstanceImage(img ociCore.Image, compartmentID *string) (InstanceImage, error) { | ||
logger.Warningf("*** CREATING INSTANCE IMAGE FROM '%s'//'%s'", *img.OperatingSystem, *img.OperatingSystemVersion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this a trace or debug, it's noise as a warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I had forgotten to clean up after debugging, done
// This code is very brittle. It's highly dependent on the strings currently | ||
// returned by the oracle sdk. The best option is to have | ||
// PlatformConfigOptions as they have types we can rely on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a warning like this for developers somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we are no longer using PlatformConfigOptions
, what comment should we keep from that?
provider/oci/images.go
Outdated
UbuntuMinimalBase = "ubuntu-minimal" | ||
UbuntuARMBase = "ubuntu-" + arch.ARM64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not a valid OS in a Base in juju proper, but an Oracle implementation detail. If they leak out of the provider into state, juju won't be able to handle them appropriately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the logic, this is gone now
provider/oci/images.go
Outdated
UbuntuPostfix UbuntuVersionPostfix = "" | ||
UbuntuMinimalPostfix UbuntuVersionPostfix = "Minimal" | ||
UbuntuAARCH64Postfix UbuntuVersionPostfix = "aarch64" | ||
UbuntuMinimalAARCH64Postfix UbuntuVersionPostfix = "Minimal aarch64" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UbuntuMinimalAARCH64Postfix is unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleaned it up, done
provider/oci/images.go
Outdated
@@ -116,6 +127,21 @@ func (i *InstanceImage) SetInstanceTypes(types []instances.InstanceType) { | |||
i.InstanceTypes = types | |||
} | |||
|
|||
// UniqueArchs returns the list of unique supported architectures by the | |||
// InstanceImage. This is necessary because an InstanceImage can support | |||
// multiple shapes, and therefore supporting multiple architectures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"and therefore supporting multiple architectures." is incorrect for oracle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes, I removed it thanks
provider/oci/images.go
Outdated
uniqueVals := make(map[string]bool) | ||
var archs []string | ||
for _, val := range i.InstanceTypes { | ||
if !uniqueVals[val.Arch] { | ||
archs = append(archs, val.Arch) | ||
} | ||
uniqueVals[val.Arch] = true | ||
} | ||
return archs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set.Strings from github.com/juju/collections does this for you, nor does it allow you to have duplicate strings.
uniqueVals := make(map[string]bool) | |
var archs []string | |
for _, val := range i.InstanceTypes { | |
if !uniqueVals[val.Arch] { | |
archs = append(archs, val.Arch) | |
} | |
uniqueVals[val.Arch] = true | |
} | |
return archs | |
archs := set.NewStrings() | |
for _, val := range i.InstanceTypes { | |
arches.Add(val.Arch) | |
} | |
return archs.Values() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh nice! done
provider/oci/environ.go
Outdated
// Since the images for ARM machine will include "aarch64" on their | ||
// name, we need to make sure we select the correct base that was | ||
// taken from the parsed image name. | ||
if arch == corearch.ARM64 { | ||
args.InstanceConfig.Base.OS = UbuntuARMBase | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have concerns about overloading the defined juju base.OS values with architecture details and the values ending up in state. Nor am I clear how this interacts with the architecture constraints.
What about updating imgCache.SupportedShapes to take an architecture value as well. The actual imgCache is local to the Oracle provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed, this was a very bad idea. I changed the implementation, now we pass the architecture as parameter to the image functions. Also, we keep track of the image's architecture on the cache so we can select them using the passed arch parameter.
// normaliseArchAndInstType returns the Juju architecture and instance type | ||
// corresponding to a shape's reported architecture and instance type based | ||
// off ShapePlatformConfigOptionsTypeEnum. | ||
func normaliseArchAndInstType(val ociCore.ShapePlatformConfigOptionsTypeEnum) (string, string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You still need to handle the InstanceTypes here, VM, BM etc for use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is done in instTypeByShapeName()
which I didn't touch
a9e0722
to
8db2897
Compare
@nvinuesa once a review has started, please add new commits with changes rather than force pushing an amended commit (excepting rebases). It makes it more difficult for reviewers. They can be squashed before merge. |
This PR is blocked while @nvinuesa investigates improved image and instance modeling for juju where Flexible instances exist in a cloud. |
#16357 In the context of adding ARM support for Oracle OCI (#16277), we must fist support oracle flexible shapes because the only ARM shape on OCI is a flexible one (https://docs.oracle.com/en-us/iaas/Content/Compute/References/computeshapes.htm#flexible). In order to support flexible (https://docs.oracle.com/en-us/iaas/Content/Compute/References/computeshapes.htm#flexible) shapes we must pass some configuration `ShapeConfig` (at least the number of cpu-cores) at the moment of launching the instance. This configuration value will be either the user-passed constraint(s) cpu-cores and memory, else a default minimum cpu-cores. We also define a new constant for minimum cpu-cores that is matched on `instances.MatchingInstanceTypes`. Also, we cannot pass this `ShapeConfig` value if the selected shape is not flexible so we have to be able to distinct between flexible and non-flexible shapes. This was done by adding `MaxCpuCores` and `MaxMem` on `InstanceType` _Bonus: some cleanup on tests._ ## Checklist *If an item is not applicable, use `~strikethrough~`.* - [X] Code style: imports ordered, good names, simple structure, etc - [X] Comments saying why design decisions were made - [X] Go unit tests, with comments saying what you're testing - [ ] ~[Integration tests](https://github.com/juju/juju/tree/main/tests), with comments saying what you're testing~ - [ ] ~[doc.go](https://discourse.charmhub.io/t/readme-in-packages/451) added or updated in changed packages~ ## QA steps ``` juju bootstrap --debug --config compartment-id=ocid1.compartment.oc1..aaaaaaaanvu2racnlczevenu73dlcf3nokgsjpdkbdgp4xbrz3lb2y4giyjq oci-canonical c ``` check OCIs console and make sure the `VM.Standard1.1` shape was selected as the controller running instance. ## Links **Jira card:** JUJU-[4694]
@nvinuesa with the qa instructions, I fail to bootstrap with:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need more time to review. The parsing is complex. The qa did not work as written.
provider/oci/images.go
Outdated
VirtType: string(val.ImageType), | ||
// Iterate over the list of supported architectures of the | ||
// image. | ||
for _, arch := range val.Archs() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to iterate over val.Arches(), if val can only be 1 arch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the issue is that we cannot enforce that. Since the instanceTypes
are created from the returned list of shapes from the API, we can't be sure that it will always be the case. You suggest that we throw an error if len(val.Archs()) > 1
?
provider/oci/images.go
Outdated
// struct returned by oci's API. Also, it returns the architecture that is | ||
// detected from the image OperatingSystemVersion and it's name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 2nd sentence doesn't match the function return values.
s/it's/its/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just rephrased it
provider/oci/images.go
Outdated
// we need to split the provided OperatingSystemVersion, since | ||
// it can contain information that does not belong to the base's | ||
// channel. | ||
// E.g: Canonical-Ubuntu-22.04-Minimal-aarch64-2023.08.27-0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More examples of possible OperatingSystemVersion values would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a second example, with more detailed explanation
provider/oci/images.go
Outdated
// becomes: | ||
// OperatingSystem: Canonical Ubuntu | ||
// OperatingSystemVersion: 22.04 Minimal aarch64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was copied from older code, and should not be valid with your fixes. Please validate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rephrased this comment
@hmlanigan regarding the QA, I suspect you already had something in
please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one took a while. It was worth it to improve how juju models instance selections for newer types of cloud machines.
By correctly parsing the version of the ubuntu cloud images returned by OCI's API, we can then create different bases for ubuntu aarch64 (the cloud image needed to target arm-based shapes). Bonus: fixed a test that had incorrect logic.
Since OCI image parsing is becoming complex, more examples were needed.
…res don't match the image's architecture
/merge |
#16396 Merges: - #16368 - #16357 - #16135 - #16391 - #16277 - #16296 Conflicts: - deleted by us: api/state_test.go - both modified: apiserver/admin_test.go - both modified: apiserver/facades/client/client/client.go - both modified: apiserver/facades/client/client/status.go - both modified: apiserver/facades/client/client/status_test.go - both modified: apiserver/facades/client/storage/storage.go - both modified: cmd/juju/ssh/ssh_machine.go - both modified: cmd/juju/ssh/ssh_machine_test.go - both modified: cmd/juju/status/export_test.go - both modified: cmd/juju/status/status.go - both modified: cmd/juju/status/status_internal_test.go - both modified: cmd/juju/status/status_test.go - deleted by them: core/charm/computedseries.go - both modified: state/application.go - both modified: state/application_test.go - both modified: state/state.go - both added: testcharms/charm-repo/bionic/mysql/manifest.yaml All conflicts originating from: - #16135 - #16296
By correctly parsing the version of the ubuntu cloud images returned by OCI's API, we can then create different bases for ubuntu aarch64 (the cloud image needed to target arm-based shapes).
It is important to note that we will not be supporting
Minimal aarch64
ubuntu images, based on what is documented for using ARM-based shapes: https://docs.oracle.com/en-us/iaas/Content/Compute/References/images.htmBonus: By moving the retrieve of
instanceTypes()
after we correctly parsed theInstanceImage
, we avoid useless calls to the API to retrieve the shapes for images that we don't support anyways.Bonus 2: Fixed the Makefile for building cross-compiled versions (thx to @hmlanigan and @tlm).
Bonus 3: Fixed the
TestRefreshImageCacheWithInvalidImage
test which had incorrect logic.Checklist
Integration tests, with comments saying what you're testingdoc.go added or updated in changed packagesQA steps
If you log in the OCI dashboard, you should see 3
VM.Standard.A1.Flex
instances (one for the controller, two for the added machines) with 1 Ocpu for two of them and 3 Ocpus/ 11GB for the third one.Links
Jira card: JUJU-4514