Anchor Events Are Extremely Compute Intensive When Used Without Maximal Compiler Optimizations #863

Closed
bonedaddy opened this issue Oct 10, 2021 · 20 comments · Fixed by #1807
Labels: help wanted, lang, ts

bonedaddy (Contributor) commented Oct 10, 2021

Overview

With the default compiler optimizations, logging the same information through emit! is approximately 2x more expensive in compute units than using msg! directly.

When using maximal compiler optimizations, as follows, the gap is drastically reduced, but emit! is still more expensive:

lto = "fat"
codegen-units = 1
[profile.release.build-override]
opt-level = 3
incremental = false
codegen-units = 1

While this can solve compute unit usage issues within your own programs, it still leaves them at risk of hitting compute unit limits when making CPIs to third-party programs that emit events without these compiler optimizations.

As it stands this is more of an edge case; however, if the current event system is kept in place, then as more programs are deployed onto Solana and programs become more interconnected, this will probably start to become a real issue.

Proposed Solution - Avoid Borsh Serialization And Base64 Encoding Completely

Personally, I think the best option is to remove the serialization and encoding of events entirely and use proc macros to format the event into a JSON string, which is what gets logged. This has the added bonus of being human-readable on block explorers.
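
A rough sketch of the idea, assuming a hypothetical #[derive(JsonLog)]-style proc macro (not an existing Anchor macro) whose generated code boils down to a single pre-formatted msg! call:

use anchor_lang::prelude::*;

pub struct MyEvent {
    pub message: String,
    pub number: u64,
}

// Hypothetical: roughly what the derive could generate for this event,
// instead of Borsh-serializing and base64-encoding it.
impl MyEvent {
    pub fn log_json(&self) {
        // No serialization or encoding, just string formatting.
        msg!(
            "{{\"message\": \"{}\", \"number\": {}}}",
            self.message,
            self.number
        );
    }
}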

Alternative Solutions

  • Update the Anchor docs to recommend the aforementioned compiler optimizations when using Anchor events.
  • When running anchor init, add the aforementioned compiler optimizations to the default workspace file (a sketch of the resulting Cargo.toml profile section is shown below).
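
For reference, a sketch of where these settings would presumably live in the workspace Cargo.toml (placing them under [profile.release] is my assumption; the values match the snippet above):

[profile.release]
lto = "fat"
codegen-units = 1

[profile.release.build-override]
opt-level = 3
incremental = false
codegen-units = 1

This is the same custom profile the later comments suggest shipping in the CLI templates.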

Optimize The Process Of Emitting Events

Not exactly sure what this would look like, but there's probably something that can be done. At a quick glance, changing this function to preallocate the vector entirely would help (a rough sketch is below), but I'm not sure if this is possible.
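
A minimal sketch of the preallocation idea, assuming the serialized size of the event can be estimated up front; this is illustrative only, not the actual Anchor internals:

use borsh::BorshSerialize;

// Build the log payload in one preallocated buffer instead of letting the
// Vec grow (and reallocate) while the event is being serialized.
fn event_data<T: BorshSerialize>(discriminator: [u8; 8], event: &T, estimated_len: usize) -> Vec<u8> {
    let mut data = Vec::with_capacity(8 + estimated_len);
    data.extend_from_slice(&discriminator);
    event
        .serialize(&mut data)
        .expect("event serialization should not fail");
    data
}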

Add Compiler Optimizations To Default Workspace File Generated By Anchor

armaniferrante (Member):

Have you measured the difference in compute units between JSON vs base64(borsh)?

Henry-E (Contributor) commented Oct 10, 2021

Mango said Anchor events reduced their compute units for logging from 30k units to 4k. What kind of compute unit usage are you seeing?

bonedaddy (Contributor, Author) commented Oct 10, 2021

Have you measured the difference in compute units between JSON vs base64(borsh)?

Ah, perhaps I should clarify: regardless of the serialization method used, any sort of serialization will probably end up costing more or less the same amount of compute units. I'm not super familiar with proc macros, but I think a proc macro that generates a pre-formatted message would be cheaper.

Mango said Anchor events reduced their compute units for logging from 30k units to 4k. What kind of compute unit usage are you seeing?

From a quick look at the two branches (one using traditional msg! invocations, one using Anchor events), it seems less information is included in the Anchor events, which would probably explain the reduced compute units.

Examples

Here's an extremely quick PoC I put together; as per the program logs, Anchor events consume more than twice the compute units of simply logging a formatted JSON string (a rough sketch of the two instructions follows the first set of logs):

Streaming transaction logs mentioning Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS. Confirmed commitment
Transaction executed in slot 1:
  Signature: 3a8nZ82HxiCmw8i9Sh2rN73UXPXXUfZbisgd6jjMBNcJNSRGFuEqzRntHACRoUbxB4cxtCvMWYYB3NSHqoyTqdnZ
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: pev3G58vvc4GAAAAZm9vYmFyewAAAAAAAAA=
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 2684 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success
Transaction executed in slot 2:
  Signature: h8qk7juRpRcYefw4B5fKsKtFrL9AfDEQNSoNAFwX4KtiEXh1rWQGBPkRBwJbnoUbPqJkmj6rW1eFq369jvj8igK
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: {"message": "foobar", "number": 123}
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 1214 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success
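
For reference, the two instructions behind this first example presumably look roughly like the following; the handler and struct names are hypothetical, and the event layout is inferred from the logged data:

use anchor_lang::prelude::*;

declare_id!("Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS");

#[program]
pub mod emit_poc {
    use super::*;

    // emit! Borsh-serializes the event, base64-encodes it, and logs the result.
    pub fn log_with_event(_ctx: Context<Noop>) -> ProgramResult {
        emit!(PocEvent {
            message: "foobar".to_string(),
            number: 123,
        });
        Ok(())
    }

    // msg! logs a pre-formatted JSON string directly.
    pub fn log_with_msg(_ctx: Context<Noop>) -> ProgramResult {
        msg!("{{\"message\": \"{}\", \"number\": {}}}", "foobar", 123);
        Ok(())
    }
}

#[derive(Accounts)]
pub struct Noop {}

#[event]
pub struct PocEvent {
    pub message: String,
    pub number: u64,
}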

Here's the same PoC but with more data being logged; approximately twice the compute units are still being used:

Streaming transaction logs mentioning Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS. Confirmed commitment
Transaction executed in slot 1:
  Signature: 5QNVvaZpkk1xXHazr2K4MJ8SS6ezvBmeArvaPY9DHpX4yRMXzz28K7986nLViY1DWxtgtzToBfaFg2i5hjFMtRAX
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: pev3G58vvc4GAAAAZm9vYmFyewAAAAAAAAADAAAAYmF6AwAAAGJhcgEBAAAAYQEAAABi
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 4769 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success
Transaction executed in slot 2:
  Signature: 5uKfhp57JUzVzTPosSZwT6uaAZeuQpsgZfdSEhwbuRAVXayqiEQzF38Cra1vybh9z9MugNZ1SJXbWa6zJVNx9bXa
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: {"message": "foobar", "number": 123, "such": "baz", "compute_units": "bar", "very_wow": "true", "a": "a", "b": "b"}
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 2309 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success

Applying the following compiler optimizations drastically reduces the discrepancy; however, this doesn't really solve the issue because it assumes everyone will deploy their Anchor programs with these optimizations added:

lto = "fat"
codegen-units = 1
[profile.release.build-override]
opt-level = 3
incremental = false
codegen-units = 1

This gives the following compute unit usage:

Streaming transaction logs mentioning Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS. Confirmed commitment
Transaction executed in slot 1:
  Signature: 3L68jUnKw7XZaggryA8UfH9nXBwexXsJtQgj2XCBuSGzG6crsuFxWtgQE6dWdhTedTdBx825sv3nKkvZ9akMAZ8e
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: pev3G58vvc4GAAAAZm9vYmFyewAAAAAAAAADAAAAYmF6AwAAAGJhcgEBAAAAYQEAAABi
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 2136 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success
Transaction executed in slot 2:
  Signature: 5izscJ9wTFnup8gxYyVponLVpgRuGhvEGfwbbaSa3KDmBMz5ZJ1ojH3jaSN3C1gXKDttCtWwrGecmTU2ZW9EGpqW
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: {"message": "foobar", "number": 123, "such": "baz", "compute_units": "bar", "very_wow": "true", "a": "a", "b": "b"}
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 1976 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success

armaniferrante (Member):

any sort of serialization will probably end up costing more or less the same amount of compute units.

Can you clarify the original issue in this context and the suggested fix? Programs don't need to use the emit! macro if they find it too costly, and if there's a way to make it more efficient, we should do it.

bonedaddy (Contributor, Author) commented Oct 10, 2021

Can you clarify the original issue in this context and the suggested fix?

Sure, I'll update the original issue.

Programs don't need to use the emit! macro if they find it too costly, and if there's a way to make it more efficient, we should do it.

That's what we've personally done with our programs for SolFarm, but that only puts a band-aid on the issue, and it only truly works as a solution if your program never CPIs to other programs, or at least never CPIs to programs that use the emit! macro.

bonedaddy changed the title from "Anchor Events Are Extremely Compute Intensive" to "Anchor Events Are Extremely Compute Intensive When Used Without Maximal Compiler Optimizations" on Oct 10, 2021
Henry-E (Contributor) commented Oct 11, 2021 via email

fanatid (Contributor) commented Oct 11, 2021

@bonedaddy can you measure the instruction count with this PR? #870

bonedaddy (Contributor, Author):

Based on your program logs you're roughly saying that emitting the serialized data is 2k units whereas msg! logging a json string (pre formatted) is only 1k units.

To clarify, this is with compiler optimizations applied, which isn't always going to be the case. Without compiler optimizations it's 4k, not 2k.

Though honestly, 1k units vs. 2k units might seem large percentage wise but in absolute terms it's more insignificant.

Right, it's negligible in this context, where the amount of data being emitted is tiny and nothing else is going on. But once you start interacting with other programs via CPI, emitting multiple events, and getting into complex DeFi protocols, it's not negligible at all.

Up until recently, the programs we've integrated with emitted few enough events that the compute units used by events weren't problematic. However, when Saber migrated their farms to Quarry, which emits a number of events, the compute units consumed by events started becoming a problem.

When harvesting rewards from Quarry, 3 events are emitted:

1:

0Q/jA6yPVSDIdBbhXCLvltw50ZnjDx2hzw74NVn49kmpYj2VZHQJoeJoYJqaKcvuxCi/2i4yywedcVNDWkM84Iuw+cEn9/RO9XIsCgAAAABekW0CBY6JdAnNHugdhAYIn3bc02X5Bd3Knr5atolzZA==

2:

0Q/jA6yPVSDIdBbhXCLvltw50ZnjDx2hzw74NVn49kmpYj2VZHQJoeJoYJqaKcvuxCi/2i4yywedcVNDWkM84Iuw+cEn9/ROrkIAAAAAAAAzLlBGju5fSXlMaaQr6xpNwj+ObGjudYJMFWBICs4aSg==

3:

XQ9GqjCM1NvQRUnAelAgZHG7mfT0auJVqc42SliyLJ5mi/zqETpgnAkWWUpkXF5787vWAVtrlruEZGZ1Wdppm7aySu3iC1+JCrXY4qBFI9dveEERQ1c4kdU46xjxj9Vi+QXkb2Kx45T1ciwKAAAAAK5CAAAAAAAAquNjYQAAAAA=

These logs contain roughly 2x as many characters as the log from my PoC. For argument's sake, let's say this implies a 2x increase in compute units.

Quarry doesn't include the compiler optimizations, so one event consumes roughly 8k compute units, for a total of 24k compute units across all three events, or 12% of the available compute units. If Quarry were built with the compiler optimizations, it would be half that, so 6% of the available compute units. If the events were logged as JSON, it would amount to around 3%.

In the case of Saber's Quarry farms, the token you get from harvesting is an IOU which needs to be redeemed by another program, which also emits an event of roughly equal size. Using the hypothetical estimates so far, harvesting + redeeming rewards without optimizations consumes a total of 32k compute units, or 16% of the available compute units.
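
Spelling out the back-of-the-envelope arithmetic (all per-event costs here are hypothetical estimates, not measurements):

fn main() {
    const BUDGET: u64 = 200_000;      // compute unit budget per instruction
    const EVENT_NO_OPT: u64 = 8_000;  // estimated cost per Quarry-sized event, no optimizations
    const EVENT_OPT: u64 = 4_000;     // estimated cost per event with the custom profile

    let harvest = 3 * EVENT_NO_OPT;             // 24_000 CUs = 12% of the budget
    let harvest_opt = 3 * EVENT_OPT;            // 12_000 CUs = 6% of the budget
    let harvest_and_redeem = 4 * EVENT_NO_OPT;  // 32_000 CUs = 16% of the budget

    println!("{harvest} {harvest_opt} {harvest_and_redeem} of {BUDGET}");
}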

Hypothetical estimates aside, the number of events emitted by the harvest and redeem steps was enough that we had to adjust our vaults to do this in two separate transactions. Even if Quarry were deployed with compiler optimizations, we would be right on the edge of compute unit consumption and would likely still have to do this in two steps. Events emitted as JSON would allow this to happen in one transaction.

Solana DeFi is just getting started, and with Anchor becoming increasingly popular, it's reasonable to assume that events are going to be used more and more. If DeFi on Solana is going to behave as it does in other ecosystems, that means many different protocols interacting with each other. So on a small scale (i.e. a program being used by a web UI, or in this PoC) I agree the cost savings don't matter that much. But once you consider the broader picture, a Solana ecosystem with hundreds of dapps interacting with each other, it does matter.

At present I believe compute unit budgets are applied per instruction, so the easiest workaround is to simply split the work across multiple instructions when you create a transaction. However, considering there is a change coming to Solana that will apply the compute unit budget transaction-wide, this is only a short-term solution.

Also relevant questions. Are you using a lot of these calls? Would it be possible to consolidate into less calls?

Apologies if I wasn't clear: we don't actually use events with SolFarm. The issue comes from needing to interact with other programs that do use events.

bonedaddy (Contributor, Author):

@bonedaddy can you measure the instruction count with this PR? #870

Yup, I'm actually about to sign off for the night so I'll do this in the morning.

Henry-E (Contributor) commented Oct 11, 2021

These logs contain roughly 2x as many characters as the log from my PoC. For argument's sake, let's say this implies a 2x increase in compute units.

Does compute cost actually increase linearly with the amount of text being printed? This isn't something I've heard of before.

Solana DeFi is just getting started, and with Anchor becoming increasingly popular, it's reasonable to assume that events are going to be used more and more.

No one's arguing against you on these points; just trying to accurately gauge what the problem actually looks like in order to have a strong starting point for figuring out potential solutions. Events and good logging are super important for the whole ecosystem!

bonedaddy (Contributor, Author):

Does compute cost actually increase linearly with the amount of text being printed? This isn't something I've heard of before.

I'm not sure to be honest; I just used that as a guesstimate. However, even if it doesn't scale linearly and it's only a 25%, 15%, or 10% increase, the compute units consumed by events become problematic the more complex your program gets. Even without complex programs, emitting one too many events is enough to cause problems (e.g. the harvest + redemption example I mentioned previously).

No one's arguing against you on these points; just trying to accurately gauge what the problem actually looks like in order to have a strong starting point for figuring out potential solutions. Events and good logging are super important for the whole ecosystem!

Events are very important indeed, but the compute units currently consumed by events will not scale to more complex protocols that involve interacting with many other protocols. It's entirely possible to work around this now, but once transaction-wide compute limits are implemented, it will be a blocker.

bonedaddy (Contributor, Author):

@fanatid I got the following error:

    'Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]',
    'Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 198140 of 200000 compute units',
    'Program failed to complete: invalid utf-8 sequence of 1 bytes from index 0: [165, 235, 247, 27, 159, 47, 189, 206, 112, 101, 118, 51, 71, 53, 56, 118, 118, 99, 52, 71, 65, 65, 65, 65, 90, 109, 57, 118, 89, 109, 70, 121, 101, 119, 65, 65, 65, 65, 65, 65, 65, 65, 65, 68, 65, 65, 65, 65, 89, 109, 70, 54, 65, 119, 65, 65, 65, 71, 74, 104, 99, 103, 69, 66, 65, 65, 65, 65, 89, 81, 69, 65, 65, 65, 66, 105]',
    'Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS failed: Program failed to complete'

fanatid (Contributor) commented Oct 11, 2021

@bonedaddy Not sure why the version with Vec<u8> doesn't work 🤔 I've reverted it; can you try one more time?

bonedaddy (Contributor, Author):

Unfortunately that made compute unit consumption worse 😭

Compiling with optimizations:

  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: pev3G58vvc4GAAAAZm9vYmFyewAAAAAAAAADAAAAYmF6AwAAAGJhcgEBAAAAYQEAAABi
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 7952 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success
Transaction executed in slot 2:
  Signature: 63CfsfVJqWqnQCcuNK1DnnYDH3exLWYxja3zqy9VFrCTUr9upft8PLod6jmMnBq1PEj3amAsPcSh46dt9UxdQWXo
  Status: Ok
  Log Messages:
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS invoke [1]
    Program log: {"message": "foobar", "number": 123, "such": "baz", "compute_units": "bar", "very_wow": "true", "a": "a", "b": "b"}
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS consumed 1976 of 200000 compute units
    Program Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS success

fanatid (Contributor) commented Oct 11, 2021

Interesting. Thanks for checking!

armaniferrante (Member):

A new syscall API was added recently, which hopefully addresses some of these concerns. See solana-labs/solana#19764.
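
For context, sol_log_data logs raw byte slices rather than formatted strings. A minimal sketch of emitting already-serialized event bytes through it directly (outside of any Anchor macro) might look like this:

use solana_program::log::sol_log_data;

// Hands the raw bytes to the runtime, which records them as a
// "Program data: <base64>" log line, so the program no longer spends
// compute units on base64 encoding or string formatting on-chain.
pub fn log_event_bytes(discriminator: &[u8], serialized_event: &[u8]) {
    sol_log_data(&[discriminator, serialized_event]);
}

This syscall is what the later benchmarks refer to when they say master now uses sol_log_data.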

bonedaddy (Contributor, Author):

Interesting, that looks very promising.

armaniferrante added this to To do in Priorities on Feb 11, 2022
armaniferrante added the help wanted, lang, and ts labels on Feb 11, 2022
armaniferrante (Member):

What Mango is doing to make this more memory-efficient: https://github.com/blockworks-foundation/mango-v3/pull/134/files

paul-schaaf (Contributor):

@bonedaddy would you mind benchmarking again with the master branch? master now uses the sol_log_data syscall.

paul-schaaf (Contributor) commented Apr 18, 2022

Made some benchmarks for Anchor 0.24.2:

repo: https://github.com/paul-schaaf/emit-bench (master has the custom profile, without-profile branch does not)
results: https://gist.github.com/paul-schaaf/4742a7c474c38807918b2f7e6466b796

summary:

  • Logging pubkeys with msg! is more expensive than emitting them, with or without a custom profile.
  • Without a custom profile, emitting is more efficient now that we have sol_log_data, but it's still not as cheap as with the custom profile.
  • With the custom profile, emit! now even beats msg!.

however, this doesn't really solve the issue because it assumes everyone will deploy their Anchor programs with these optimizations added

We can add the custom profile to our CLI templates so that every Anchor project going forward uses it.
