Fix test_cli_program_extend_program #1115

CriesofCarrots · 2024-04-29T22:55:19Z

Problem

test_cli_program_extend_program() expects a program account to be too small for a specific program, but #791 extends during deploy by default.
The api is confusing because of the double negatives, eg no_extend: false, !no_extend

Summary of Changes

Set no_extend: true throughout test_cli_program_extend_program()

I'm okay with using the --no-extend flag, but I would recommend we rewrite the fields and logic to remove the double negatives.

I am not sure why the test passes sometimes. I printed out the length of the account after the command at L1091 and sometimes it's 4820 (the one-byte-too-small size) and sometimes it's 4821. With the change in this PR, it's always 4820.
Possibly the extend transaction hadn't actually completed by the time this was called?

agave/cli/src/program.rs

Lines 2548 to 2550 in 5b3390b

    
           if let Some(program_data_account) = rpc_client 
        
               .get_account_with_commitment(&program_data_address, config.commitment)? 
        
               .value

Commitment seems consistent across all commands and client, though.
The deploy command must be failing for some other reason in the cases the test passes.

Fixes #1114

…t to

buffalojoec

Lgtm, thanks for catching! I saw this CI failure a couple times and meant to check on it myself as well.

As for the confusing API, what do you think about changing the flag to no longer require a value, and simply be a flag-only option, like --no-auto-extend? Happy to make this change myself.

codecov-commenter · 2024-04-30T00:29:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.2%. Comparing base (9f10f09) to head (b6cac1b).
Report is 6 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##           master    #1115     +/-   ##
=========================================
- Coverage    82.2%    82.2%   -0.1%     
=========================================
  Files         880      880             
  Lines      235631   235626      -5     
=========================================
- Hits       193718   193706     -12     
- Misses      41913    41920      +7

CriesofCarrots · 2024-04-30T01:25:54Z

As for the confusing API, what do you think about changing the flag to no longer require a value, and simply be a flag-only option, like --no-auto-extend? Happy to make this change myself.

I think it's already a flag-only option?

agave/cli/src/program.rs

Lines 278 to 283 in 1cdaa6d

    
           .arg( 
        
               Arg::with_name("no_extend") 
        
                   .long("no-extend") 
        
                   .takes_value(false) 
        
                   .help("Don't automatically extend the program's data account size"), 
        
           ),

--no-auto-extend is probably slightly better than the existing --no-extend, but the main change I'd prefer is to convert the no_extend variable/field to something like auto_extend everywhere, and then flop the logic here:

agave/cli/src/program.rs

Line 2544 in 1cdaa6d

if !no_extend {

Wdyt @buffalojoec ?

buffalojoec · 2024-04-30T01:29:59Z

I think it's already a flag-only option?

agave/cli/src/program.rs

Lines 278 to 283 in 1cdaa6d

.arg(

Arg::with_name("no_extend")

.long("no-extend")

.takes_value(false)

.help("Don't automatically extend the program's data account size"),

),

Ahh nuts, you're right it is. 😁

--no-auto-extend is probably slightly better than the existing --no-extend, but the main change I'd prefer is to convert the no_extend variable/field to something like auto_extend everywhere, and then flop the logic here:

agave/cli/src/program.rs

Line 2544 in 1cdaa6d

if !no_extend {

Wdyt @buffalojoec ?

This was actually what I originally sought to avoid in the PR that introduced the change. The original implementation was doing what you suggested.

My thinking was that most people would run into the transaction error and then turn around and pass the flag, so I felt it was better for DevEx to let those who specifically wanted to opt-out of auto-extend provide the flag, while everyone else got the auto-extend by default. Smoother for newbies.

CriesofCarrots · 2024-04-30T01:32:29Z

I felt it was better for DevEx to let those who specifically wanted to opt-out of auto-extend provide the flag, while everyone else got the auto-extend by default. Smoother for newbies.

Yes, I'm not suggesting changing the behavior for cli users. I'm suggesting flopping the internal logic so it's less of a foot-gun for core contributors.

I'll just PR it.

ilya-bobyr · 2024-04-30T22:30:12Z

I felt it was better for DevEx to let those who specifically wanted to opt-out of auto-extend provide the flag, while everyone else got the auto-extend by default. Smoother for newbies.

Yes, I'm not suggesting changing the behavior for cli users. I'm suggesting flopping the internal logic so it's less of a foot-gun for core contributors.

I'll just PR it.

I was also looking at this test, and the double negation is pretty much everywhere and seems unnecessary.
I think, the CLI argument should be called extend or something like that. With true being the default. In which case, the users can use --no-extend to change it to false.
Internally, the name of the configuration parameter would match the name of the CLI argument (unlike what it is right now), and also be called extend.

CriesofCarrots · 2024-04-30T22:42:45Z

@ilya-bobyr #1119
Feel free to make more better, though

ilya-bobyr · 2024-05-01T01:55:55Z

I am not sure why the test passes sometimes. I printed out the length of the account after the command at L1091 and sometimes it's 4820 (the one-byte-too-small size) and sometimes it's 4821. With the change in this PR, it's always 4820. Possibly the extend transaction hadn't actually completed by the time this was called?

agave/cli/src/program.rs

Lines 2548 to 2550 in 5b3390b

if let Some(program_data_account) = rpc_client

.get_account_with_commitment(&program_data_address, config.commitment)?

.value

This call happens before we perform any operations, IIUC.
And the size of the program is extended only as a result of the logic below it.
So it seems a bit strange if this call would be somehow related to the issue.

Commitment seems consistent across all commands and client, though.
The deploy command must be failing for some other reason in the cases the test passes.

It is interesting that the length of the program is different, though...
I was also trying to debug this on Friday as well.
What I noticed is that if I enable the auto extend, the test passes with an error that is different from the error that it was originally seeing here:

agave/cli/tests/program.rs

Line 1108 in 5b3390b

process_command(&config).unwrap_err();

The trick is that because the test is not precise enough, any error causes it to pass.

If I change the test code to auto_extend: false, the error that causes the test to pass is this:

Err("Deploying program failed: RPC response error -32002: Transaction simulation failed: Error processing Instruction 0: account data too small for instruction [3 log messages]")

It seems like it comes from here:

agave/cli/src/program.rs

Lines 2896 to 2916 in 5b3390b

    
           if let Some(mut message) = final_message { 
        
               if let Some(final_signers) = final_signers { 
        
                   trace!("Deploying program"); 
        
                   simulate_and_update_compute_unit_limit(&rpc_client, &mut message)?; 
        
                   let mut final_tx = Transaction::new_unsigned(message); 
        
                   let blockhash = rpc_client.get_latest_blockhash()?; 
        
                   let mut signers = final_signers.to_vec(); 
        
                   signers.push(fee_payer_signer); 
        
                   final_tx.try_sign(&signers, blockhash)?; 
        
                   return Ok(Some( 
        
                       rpc_client 
        
                           .send_and_confirm_transaction_with_spinner_and_config( 
        
                               &final_tx, 
        
                               config.commitment, 
        
                               RpcSendTransactionConfig { 
        
                                   preflight_commitment: Some(config.commitment.commitment), 
        
                                   ..RpcSendTransactionConfig::default() 
        
                               }, 
        
                           ) 
        
                           .map_err(|e| format!("Deploying program failed: {e}"))?,

Are we somehow ignoring errors during the buffer update and just fail when we attempt to finalize the program?

I would expect it to fail here instead:

agave/cli/src/program.rs

Lines 2880 to 2891 in 5b3390b

    
           .map_err(|err| format!("Data writes to account failed: {err}"))? 
        
           .into_iter() 
        
           .flatten() 
        
           .collect::<Vec<_>>(); 
        
           if !transaction_errors.is_empty() { 
        
               for transaction_error in &transaction_errors { 
        
                   error!("{:?}", transaction_error); 
        
               } 
        
               return Err( 
        
                   format!("{} write transactions failed", transaction_errors.len()).into(), 
        
               );

But when the test succeeds with auto_extend: true, the error is different:

Err("Account allocation failed: RPC response error -32002: Transaction simulation failed: Error processing Instruction 2: invalid program argument [7 log messages]")

It originates here:

agave/cli/src/program.rs

Lines 2807 to 2808 in 5b3390b

    
           log_instruction_custom_error::<SystemError>(result, config) 
        
               .map_err(|err| format!("Account allocation failed: {err}"))?;

It seems like it is the buffer extend operation itself that fails?
I am not really sure.
This is where I got stuck on Friday and didn't get back to it until now.

ilya-bobyr · 2024-05-01T02:01:03Z

Separately, I would like to point out that this test has two issues:

The test is too long, like 4 tests combined.
And the state from the previous "test" is reused in the next one.
This makes it needlessly harder to debug.
Should have been 4 independent tests. And they can also run in parallel.
The error check is not precise enough.

t-nelson · 2024-05-01T04:15:32Z

the default is to spend the user's money, rather than have them opt-in to it?

ilya-bobyr · 2024-05-01T04:33:50Z

the default is to spend the user's money, rather than have them opt-in to it?

Good question. There is some reasoning here: #791 (comment)

Developers attempting to calculate this amount over and over again have become frustrated and some have just spent the money to allocate way more than they should. Some even as much as 10mb. This is bad both for the cluster and developer experience.

Looking at the rent except calculation here:

agave/sdk/program/src/rent.rs

Lines 73 to 77 in 9403ca6

    
           pub fn minimum_balance(&self, data_len: usize) -> u64 { 
        
               let bytes = data_len as u64; 
        
               (((ACCOUNT_STORAGE_OVERHEAD + bytes) * self.lamports_per_byte_year) as f64 
        
                   * self.exemption_threshold) as u64 
        
           }

I get:

ACCOUNT_STORAGE_OVERHEAD = 128
bytes = 10 * 1024 * 1024
lamports_per_byte_year = 3480
exemption_threshold = 2.0

(ACCOUNT_STORAGE_OVERHEAD + bytes) * lamports_per_byte_year * self.exemption_threshold

(128 + 10_485_760) * 3480 * 2 = 72_981_780_480

73 SOL is about $9k in today prices.

So I am not sure people would really allocate 10MBs just to save some computation. Even at $20, it is $1,460.

CriesofCarrots requested a review from buffalojoec April 29, 2024 22:55

CriesofCarrots added 2 commits April 29, 2024 17:00

Unignore test

37613f9

Don't extend program accounts by default on test that expects them no…

b6cac1b

…t to

CriesofCarrots force-pushed the fix-flaky-test branch from dca2b45 to b6cac1b Compare April 29, 2024 23:00

buffalojoec approved these changes Apr 29, 2024

View reviewed changes

CriesofCarrots merged commit 680946c into anza-xyz:master Apr 30, 2024
38 checks passed

yihau mentioned this pull request May 3, 2024

Flaky Test: solana-cli::program test_cli_program_extend_program #1173

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix test_cli_program_extend_program #1115

Fix test_cli_program_extend_program #1115

CriesofCarrots commented Apr 29, 2024 •

edited

buffalojoec left a comment

codecov-commenter commented Apr 30, 2024

CriesofCarrots commented Apr 30, 2024

buffalojoec commented Apr 30, 2024

CriesofCarrots commented Apr 30, 2024 •

edited

ilya-bobyr commented Apr 30, 2024

CriesofCarrots commented Apr 30, 2024

ilya-bobyr commented May 1, 2024

ilya-bobyr commented May 1, 2024

t-nelson commented May 1, 2024

ilya-bobyr commented May 1, 2024 •

edited

	if let Some(program_data_account) = rpc_client
	.get_account_with_commitment(&program_data_address, config.commitment)?
	.value

Fix test_cli_program_extend_program #1115

Fix test_cli_program_extend_program #1115

Conversation

CriesofCarrots commented Apr 29, 2024 • edited

Problem

Summary of Changes

buffalojoec left a comment

Choose a reason for hiding this comment

codecov-commenter commented Apr 30, 2024

Codecov Report

CriesofCarrots commented Apr 30, 2024

buffalojoec commented Apr 30, 2024

CriesofCarrots commented Apr 30, 2024 • edited

ilya-bobyr commented Apr 30, 2024

CriesofCarrots commented Apr 30, 2024

ilya-bobyr commented May 1, 2024

ilya-bobyr commented May 1, 2024

t-nelson commented May 1, 2024

ilya-bobyr commented May 1, 2024 • edited

CriesofCarrots commented Apr 29, 2024 •

edited

CriesofCarrots commented Apr 30, 2024 •

edited

ilya-bobyr commented May 1, 2024 •

edited