Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement downloading of remote gz images #34

Closed
wants to merge 14 commits into from

Conversation

Razaloc
Copy link
Contributor

@Razaloc Razaloc commented Oct 11, 2022

Allow bmap-rs to copy images from a remote file:

  • Accept also an HTTP/HTTPS url argument
  • Convert setup_input into two different functions to handle the input: a new one setup_remote_input and the previous one as setup_local_input
  • By now it error if the downloaded file is not a .gz file as "not implemented error"
  • Adapt code to the async case.

Closes: #9
Closes: #8

Signed-off-by: Rafael Garcia Ruiz rafael.garcia@collabora.com

This is a preparation to allow bmap-rs fetch files, so it needs to
accept urls as well as local image paths. At this point the url will be
accepted but the file wont be downloaded causing it to not be able to
find it.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
@Razaloc
Copy link
Contributor Author

Razaloc commented Oct 11, 2022

WIP

@Razaloc Razaloc force-pushed the wip/rafaelgarrui/download branch 2 times, most recently from 3185134 to bda5a15 Compare October 11, 2022 14:48
setup_input turned into two different functions setup_local_input and
setup_remote_input which is still to be completed since it needs async
enviroment.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
When using a remote image filter if the compression type isn't ".gz" the
only one support by now.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
bmap-rs/src/main.rs Outdated Show resolved Hide resolved
bmap-rs/src/main.rs Outdated Show resolved Hide resolved
bmap-rs/src/main.rs Outdated Show resolved Hide resolved
setup_local_input and setup_remote_input now return a Input enum that
is used to switch between remote or local image origin. The download is
implemented so the file is divided in pieces and copied part by part.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
As with remote source the file to copy and the bmap are not in the same
folder we need to adapt the path, in this case url to find the bmap
file. It is supposed to be in the folder we are executing bmap-rs and
have same name plus ".bmap"

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
bmap::copy_async emulates the use of bmap::copy but adapting to async
cases. A new trait AsyncSeekForward is needed to substitute SeekForward
in those cases. This duplicity is meant to be reduced in the future.
Some new dependencies were needed.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
@Razaloc Razaloc force-pushed the wip/rafaelgarrui/download branch 6 times, most recently from 219f105 to 402a111 Compare October 13, 2022 16:23
bmap::copy_async needs inputs with different traits than bmap::copy
WIP

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
Get some help from Kov figuring out things about async traits with rust
while using read and write functions.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
bmap-rs/src/main.rs Outdated Show resolved Hide resolved
bmap-rs/src/main.rs Outdated Show resolved Hide resolved
bmap-rs/Cargo.toml Outdated Show resolved Hide resolved
@Razaloc Razaloc force-pushed the wip/rafaelgarrui/download branch 5 times, most recently from ab8ec0e to 9867819 Compare October 20, 2022 14:37
Adapt traits implementations to async.
Add https support.
Change the loop of data chunks to a stream.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
When writting the buf need the index range of the reader.
Fix some tipos and clippy warnings

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
@Razaloc Razaloc marked this pull request as ready for review October 21, 2022 11:51
Add index range into the reading buf for the async copy function to
avoid overflow.
Fix typo.

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
When downloading the image from a remote source the bmap will also be
searched remotly and downloaded.
The bmap url will be same as image url but changing the extension with
".bmap"
bmap/src/lib.rs formated with rustfmt

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
bmap-rs/src/main.rs Outdated Show resolved Hide resolved
Now the bmap data is kept in memory inside a Cursor instead than in a
File

Signed-off-by: Rafael Garcia Ruiz <rafael.garcia@collabora.com>
@sjoerdsimons
Copy link
Contributor

Can this be split out in multiple MR's building up topics in small steps (e.g. building up the async support in one MR, bugfixes for existing code in another) and also please do a clean rebase and squashing where it makes sense.

Comment on lines +86 to +107
async fn fetch_url_http(url: hyper::Uri) -> Result<Response<Body>, hyper::Error> {
let client = Client::new();
client.get(url.clone()).await
}
async fn fetch_url_https(url: hyper::Uri) -> Result<Response<Body>, hyper::Error> {
let https = HttpsConnector::new();
let client = Client::builder().build::<_, hyper::Body>(https);
client.get(url).await
}
async fn setup_remote_input(url: Uri, fpath: &Path) -> Result<Input> {
if fpath.extension().unwrap() != "gz" {
bail!("Image file format not implemented")
}
let input = Input::Remote(match url.scheme_str() {
Some("https") => fetch_url_https(url).await?,
Some("http") => fetch_url_http(url).await?,
_ => {
bail!("url is not http or https")
}
});
Ok(input)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please switch to using reqwest rather then Hyper; There is no reason for a tool like bmap to use a low-level http library like hyper directly. In particular that would avoid this sillyness to support multiple schemes.

Comment on lines +169 to +176
let mut c = Cursor::new(Vec::new());
while let Some(next) = bmap_res.data().await {
let chunk = next?;
c.write_all(&chunk)?;
}
c.set_position(0);
let mut xml = String::new();
c.read_to_string(&mut xml)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring the fact you onjly need to solve this due to using Hyper; You don't want to use a cursor here or do this loop by hand, hyper has its hyper::body::to_bytes helper that collects everything already; followed by std::str::from_utf8

Comment on lines +197 to +203
let mut output = tokio::fs::OpenOptions::new()
.write(true)
.create(true)
.open(c.dest)
.await?;
ftruncate(output.as_raw_fd(), bmap.image_size() as i64)
.context("Failed to truncate file")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done outside of the input_type determination as it's not dependant on the input

Comment on lines +216 to +218
bmap::copy_async(&mut chunk, &mut output, &bmap).await?;
println!("Done: Syncing...");
output.sync_all().await.expect("Sync failure");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably also be generic ; There is no reason why local files should be sync and remote files async, they can be treated in the same way.

let mut left = forward as usize;
while left > 0 {
let toread = left.min(buf.len());
let r = block_on(self.read(&mut buf[0..toread]))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't call block_on in an async method; as that starts a new executors that blocks the overall executor; Instead do a proper async read

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right so that explains why in my testing CPU usage would peak quite high

@Razaloc
Copy link
Contributor Author

Razaloc commented Oct 24, 2022

#41 Takes the essential for the download feature, apply the use of reqwest and avoid using cursor.
I've left the finding remote bmap for next PR.

@Razaloc
Copy link
Contributor Author

Razaloc commented Oct 24, 2022

#42 Is a small fix to adapt the version numbers to dependabot.

@Razaloc Razaloc closed this Nov 25, 2022
@Razaloc Razaloc deleted the wip/rafaelgarrui/download branch December 1, 2022 10:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants