Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix test_cli_program_extend_program #1115

Merged
merged 2 commits into from
Apr 30, 2024

Conversation

CriesofCarrots
Copy link

@CriesofCarrots CriesofCarrots commented Apr 29, 2024

Problem

test_cli_program_extend_program() expects a program account to be too small for a specific program, but #791 extends during deploy by default.
The api is confusing because of the double negatives, eg no_extend: false, !no_extend

Summary of Changes

Set no_extend: true throughout test_cli_program_extend_program()

I'm okay with using the --no-extend flag, but I would recommend we rewrite the fields and logic to remove the double negatives.

I am not sure why the test passes sometimes. I printed out the length of the account after the command at L1091 and sometimes it's 4820 (the one-byte-too-small size) and sometimes it's 4821. With the change in this PR, it's always 4820.
Possibly the extend transaction hadn't actually completed by the time this was called?

agave/cli/src/program.rs

Lines 2548 to 2550 in 5b3390b

if let Some(program_data_account) = rpc_client
.get_account_with_commitment(&program_data_address, config.commitment)?
.value

Commitment seems consistent across all commands and client, though.
The deploy command must be failing for some other reason in the cases the test passes.

Fixes #1114

Copy link

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, thanks for catching! I saw this CI failure a couple times and meant to check on it myself as well.

As for the confusing API, what do you think about changing the flag to no longer require a value, and simply be a flag-only option, like --no-auto-extend? Happy to make this change myself.

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.2%. Comparing base (9f10f09) to head (b6cac1b).
Report is 6 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #1115     +/-   ##
=========================================
- Coverage    82.2%    82.2%   -0.1%     
=========================================
  Files         880      880             
  Lines      235631   235626      -5     
=========================================
- Hits       193718   193706     -12     
- Misses      41913    41920      +7     

@CriesofCarrots
Copy link
Author

As for the confusing API, what do you think about changing the flag to no longer require a value, and simply be a flag-only option, like --no-auto-extend? Happy to make this change myself.

I think it's already a flag-only option?

agave/cli/src/program.rs

Lines 278 to 283 in 1cdaa6d

.arg(
Arg::with_name("no_extend")
.long("no-extend")
.takes_value(false)
.help("Don't automatically extend the program's data account size"),
),

--no-auto-extend is probably slightly better than the existing --no-extend, but the main change I'd prefer is to convert the no_extend variable/field to something like auto_extend everywhere, and then flop the logic here:

if !no_extend {

Wdyt @buffalojoec ?

@CriesofCarrots CriesofCarrots merged commit 680946c into anza-xyz:master Apr 30, 2024
38 checks passed
@buffalojoec
Copy link

I think it's already a flag-only option?

agave/cli/src/program.rs

Lines 278 to 283 in 1cdaa6d

.arg(
Arg::with_name("no_extend")
.long("no-extend")
.takes_value(false)
.help("Don't automatically extend the program's data account size"),
),

Ahh nuts, you're right it is. 😁

--no-auto-extend is probably slightly better than the existing --no-extend, but the main change I'd prefer is to convert the no_extend variable/field to something like auto_extend everywhere, and then flop the logic here:

if !no_extend {

Wdyt @buffalojoec ?

This was actually what I originally sought to avoid in the PR that introduced the change. The original implementation was doing what you suggested.

My thinking was that most people would run into the transaction error and then turn around and pass the flag, so I felt it was better for DevEx to let those who specifically wanted to opt-out of auto-extend provide the flag, while everyone else got the auto-extend by default. Smoother for newbies.

@CriesofCarrots
Copy link
Author

CriesofCarrots commented Apr 30, 2024

I felt it was better for DevEx to let those who specifically wanted to opt-out of auto-extend provide the flag, while everyone else got the auto-extend by default. Smoother for newbies.

Yes, I'm not suggesting changing the behavior for cli users. I'm suggesting flopping the internal logic so it's less of a foot-gun for core contributors.

I'll just PR it.

@ilya-bobyr
Copy link

I felt it was better for DevEx to let those who specifically wanted to opt-out of auto-extend provide the flag, while everyone else got the auto-extend by default. Smoother for newbies.

Yes, I'm not suggesting changing the behavior for cli users. I'm suggesting flopping the internal logic so it's less of a foot-gun for core contributors.

I'll just PR it.

I was also looking at this test, and the double negation is pretty much everywhere and seems unnecessary.
I think, the CLI argument should be called extend or something like that. With true being the default. In which case, the users can use --no-extend to change it to false.
Internally, the name of the configuration parameter would match the name of the CLI argument (unlike what it is right now), and also be called extend.

@CriesofCarrots
Copy link
Author

@ilya-bobyr #1119
Feel free to make more better, though

@ilya-bobyr
Copy link

I am not sure why the test passes sometimes. I printed out the length of the account after the command at L1091 and sometimes it's 4820 (the one-byte-too-small size) and sometimes it's 4821. With the change in this PR, it's always 4820. Possibly the extend transaction hadn't actually completed by the time this was called?

agave/cli/src/program.rs

Lines 2548 to 2550 in 5b3390b

if let Some(program_data_account) = rpc_client
.get_account_with_commitment(&program_data_address, config.commitment)?
.value

This call happens before we perform any operations, IIUC.
And the size of the program is extended only as a result of the logic below it.
So it seems a bit strange if this call would be somehow related to the issue.

Commitment seems consistent across all commands and client, though.
The deploy command must be failing for some other reason in the cases the test passes.

It is interesting that the length of the program is different, though...
I was also trying to debug this on Friday as well.
What I noticed is that if I enable the auto extend, the test passes with an error that is different from the error that it was originally seeing here:

process_command(&config).unwrap_err();

The trick is that because the test is not precise enough, any error causes it to pass.

If I change the test code to auto_extend: false, the error that causes the test to pass is this:

Err("Deploying program failed: RPC response error -32002: Transaction simulation failed: Error processing Instruction 0: account data too small for instruction [3 log messages]")

It seems like it comes from here:

agave/cli/src/program.rs

Lines 2896 to 2916 in 5b3390b

if let Some(mut message) = final_message {
if let Some(final_signers) = final_signers {
trace!("Deploying program");
simulate_and_update_compute_unit_limit(&rpc_client, &mut message)?;
let mut final_tx = Transaction::new_unsigned(message);
let blockhash = rpc_client.get_latest_blockhash()?;
let mut signers = final_signers.to_vec();
signers.push(fee_payer_signer);
final_tx.try_sign(&signers, blockhash)?;
return Ok(Some(
rpc_client
.send_and_confirm_transaction_with_spinner_and_config(
&final_tx,
config.commitment,
RpcSendTransactionConfig {
preflight_commitment: Some(config.commitment.commitment),
..RpcSendTransactionConfig::default()
},
)
.map_err(|e| format!("Deploying program failed: {e}"))?,

Are we somehow ignoring errors during the buffer update and just fail when we attempt to finalize the program?

I would expect it to fail here instead:

agave/cli/src/program.rs

Lines 2880 to 2891 in 5b3390b

.map_err(|err| format!("Data writes to account failed: {err}"))?
.into_iter()
.flatten()
.collect::<Vec<_>>();
if !transaction_errors.is_empty() {
for transaction_error in &transaction_errors {
error!("{:?}", transaction_error);
}
return Err(
format!("{} write transactions failed", transaction_errors.len()).into(),
);

But when the test succeeds with auto_extend: true, the error is different:

Err("Account allocation failed: RPC response error -32002: Transaction simulation failed: Error processing Instruction 2: invalid program argument [7 log messages]")

It originates here:

agave/cli/src/program.rs

Lines 2807 to 2808 in 5b3390b

log_instruction_custom_error::<SystemError>(result, config)
.map_err(|err| format!("Account allocation failed: {err}"))?;

It seems like it is the buffer extend operation itself that fails?
I am not really sure.
This is where I got stuck on Friday and didn't get back to it until now.

@ilya-bobyr
Copy link

Separately, I would like to point out that this test has two issues:

  1. The test is too long, like 4 tests combined.
    And the state from the previous "test" is reused in the next one.
    This makes it needlessly harder to debug.
    Should have been 4 independent tests. And they can also run in parallel.

  2. The error check is not precise enough.

@t-nelson
Copy link

t-nelson commented May 1, 2024

the default is to spend the user's money, rather than have them opt-in to it?

@ilya-bobyr
Copy link

ilya-bobyr commented May 1, 2024

the default is to spend the user's money, rather than have them opt-in to it?

Good question. There is some reasoning here: #791 (comment)

Developers attempting to calculate this amount over and over again have become frustrated and some have just spent the money to allocate way more than they should. Some even as much as 10mb. This is bad both for the cluster and developer experience.

Looking at the rent except calculation here:

pub fn minimum_balance(&self, data_len: usize) -> u64 {
let bytes = data_len as u64;
(((ACCOUNT_STORAGE_OVERHEAD + bytes) * self.lamports_per_byte_year) as f64
* self.exemption_threshold) as u64
}

I get:

ACCOUNT_STORAGE_OVERHEAD = 128
bytes = 10 * 1024 * 1024
lamports_per_byte_year = 3480
exemption_threshold = 2.0

(ACCOUNT_STORAGE_OVERHEAD + bytes) * lamports_per_byte_year * self.exemption_threshold

(128 + 10_485_760) * 3480 * 2 = 72_981_780_480

73 SOL is about $9k in today prices.

So I am not sure people would really allocate 10MBs just to save some computation. Even at $20, it is $1,460.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ignored test_cli_program_extend_program
5 participants