@galiacheng galiacheng commented Jul 8, 2021

  • nobrl: Do not send byte range lock requests to the server. Without this option, restarting a domain that is persisted to a PV can fail with an error like the following:
<Jul 8, 2021 1:44:24,378 AM GMT> <Error> <Store> <BEA-280061> <The store "_WLS_admin-server" could not be deployed: weblogic.store.PersistentStoreFatalException: [Store:280020]There was an error while reading from the storage.
weblogic.store.PersistentStoreFatalException: [Store:280020]There was an error while reading from the storage.
        at weblogic.store.internal.PersistentStoreImpl.open(PersistentStoreImpl.java:571)
        at weblogic.store.admin.AdminHandler.activate(AdminHandler.java:159)
        at weblogic.store.admin.FileAdminHandler.activate(FileAdminHandler.java:230)
        at weblogic.store.admin.DefaultStoreService.start(DefaultStoreService.java:103)
        at weblogic.server.AbstractServerService.postConstruct(AbstractServerService.java:76)
        Truncated. see log file for complete stacktrace
Caused By: java.io.IOException: Error reading from file, Permission denied, errno=13
        at weblogic.store.io.file.direct.DirectIONative.read(Native Method)
        at weblogic.store.io.file.direct.DirectIONativeImpl.read(DirectIONativeImpl.java:126)
        at weblogic.store.io.file.direct.DirectFileChannel.read(DirectFileChannel.java:241)
        at weblogic.store.io.file.StoreFile.readBulk(StoreFile.java:337)
        at weblogic.store.io.file.Heap.readStoreFile(Heap.java:1453)
        Truncated. see log file for complete stacktrace
>

@galiacheng
Contributor Author

Hello @edburns @mriccell @rjeberhard, could you please review the PR? Thank you!

@tbarnes-us tbarnes-us left a comment

We will either need to:

  • Document the risks of nobrl. Off the top of my head, I think this is only suitable if (a) no critical information such as JMS messages, JTA records, or timer records is stored in file stores -OR-, in strictly controlled and well-understood circumstances, service migration is disabled for file stores, and (b) no third-party product, system, or application code depends on byte range locks.
  • Or switch to a file system that supports locking.

Discussion continues in slack.

@galiacheng
Contributor Author

Hello @tbarnes-us thanks for your comments.

The nobrl flag is used in the official Azure file share example.

It is also listed among the useful mount options in the official AKS troubleshooting guide.

I have also added notes to the document to draw customers' attention to this, in case they have specific requirements that prevent them from using the option.

@edburns
Contributor

edburns commented Jul 12, 2021

Attached is the locking.txt file from @tbarnes-us.
locking.txt

@edburns
Contributor

edburns commented Jul 12, 2021

Hello @galiacheng, if you grant me push access to your fork, I can push this commit. It contains a rewording of the helpful content from @tbarnes-us. Otherwise, you can apply this patch.
0001-On-branch-galiacheng-main-Reword-helpful-content-fro-patch.txt

Signed-off-by: Ed Burns <edburns@microsoft.com>
 Changes to be committed:
	modified:   create-aks-cluster-storage.txt

@tbarnes-us tbarnes-us left a comment

Approving - the changes look good to me overall.

Minor suggestion: please consider adding #additional-important-file-locking-information and #mitigating-corruption-risk-when-locking-is-disabled to the table of contents at the top of the doc.

@rosemarymarano (doc lead) ideally should also review.

Contributor

@rosemarymarano rosemarymarano left a comment

Minor edits

- nobrl
```

**Note:** This sample includes `nobrl` in the `mountOptions` to disable byte range file locking on the `azurefile` storage class. This is necessary as of this writing because the `azurefile` storage class does not support advisory byte range locking. This approach is documented in the [Azure Kubernetes Service FAQ](https://docs.microsoft.com/en-us/azure/aks/troubleshooting#what-are-the-default-mountoptions-when-using-azure-files).
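For readers following the thread, a storage class carrying this mount option might look roughly like the sketch below. The storage class name, the provisioner, the `skuName`, and every mount option other than `nobrl` are illustrative assumptions and are not taken from this PR; consult the sample file itself for the actual values.

```yaml
# Hypothetical storage class; only `nobrl` is confirmed by the note above.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-azurefile            # placeholder name (assumption)
provisioner: kubernetes.io/azure-file   # assumed in-tree Azure Files provisioner
mountOptions:
  - dir_mode=0777               # illustrative permissions (assumption)
  - file_mode=0777              # illustrative permissions (assumption)
  - uid=0
  - gid=0
  - mfsymlinks
  - nobrl                       # do not send byte range lock requests to the server
parameters:
  skuName: Standard_LRS         # placeholder SKU (assumption)
```

Because `nobrl` applies to the whole mounted file system, every file store on a volume provisioned from such a class would run without byte range locking.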

This is necessary as of this writing because -> Currently, this is necessary because

Fixed in 9e2dc7c.


Here are several different approaches to disable file locking.

- When using the `azurefile` storage class, you can universally disable locking on the entire file system by enabling the `nobrl` mount option, as shown above.

as shown above. -> as shown previously.

Fixed in 9e2dc7c.

- When using the operator, you can provide this configuration without modifying your original configuration by using [configuration overrides](https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/configoverrides/) for Domain on PV or Domain in Image, or [runtime updates](https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/model-in-image/runtime-updates/) for Model in Image.
- Note that this can be a substantial amount of work and is error-prone, because it requires configuration updates for each individual default store, custom file store, and JMS paging store.

- You can disable all file store locks on a particular WebLogic server JVM by _both_ applying patch `32471832` and setting `-Dweblogic.store.file.LockEnabled=false`. When using the operator, you can set command line values using the `JAVA_OPTIONS` env var in `spec.serverPod.env` domain resource attribute. This will work for any operator domain home source type.
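As a rough illustration of the `JAVA_OPTIONS` approach described above, the relevant part of a domain resource might look like the fragment below. This is a sketch only: the rest of the domain resource is omitted, and the flag has the described effect only when patch `32471832` is also applied.

```yaml
# Fragment of a domain resource; only the attribute discussed above is shown.
spec:
  serverPod:
    env:
    - name: JAVA_OPTIONS
      # Disables file store locks for the server JVMs.
      # Requires patch 32471832 to be applied as well.
      value: "-Dweblogic.store.file.LockEnabled=false"
```

If the domain resource already defines `JAVA_OPTIONS`, the flag would need to be appended to the existing value rather than added as a second entry with the same name.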

WebLogic server-> WebLogic Server (globally)
command line values -> command-line values

Fixed in 9e2dc7c.


- File store service migration is not supported when file locking is disabled because it relies on file locks for safe behavior.

- Service migration is a WebLogic high availability option that is typically configured to enable data recovery on surviving WebLogic servers in a cluster upon an unexpected WebLogic server failure. It is also used to enable JMS and JTA data recovery from WebLogic servers that are shut down due to a cluster shrink.

WebLogic servers -> WebLogic Servers

Fixed in 9e2dc7c.

modified:   documentation/staging/content/samples/simple/azure-kubernetes-service/includes/create-aks-cluster-storage.txt

- Apply suggestions from @rosemarymarano.
@edburns
Contributor

edburns commented Jul 14, 2021

Hello @tbarnes-us, I did consider your suggestion about the TOC links, but because of the inclusion mechanism, it would require introducing a dependency on the outer domain-on-pv.md. Because this content will ultimately reside in the FAQ, per #2453, I judged it best not to add the TOC links.

@rjeberhard rjeberhard merged commit 3c42a75 into oracle:main Jul 15, 2021