Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration and improvements of File API backends #11875

Merged
merged 1 commit into from Jul 4, 2016
Merged

Integration and improvements of File API backends #11875

merged 1 commit into from Jul 4, 2016

Conversation

izgzhen
Copy link
Contributor

@izgzhen izgzhen commented Jun 27, 2016

Basically three major changes:

  1. More complete origin check in FileManagerThreadMsg
  2. Add reference counting logic to file manage store and script API
  3. Integrate the support of slicing

r? @Manishearth

Part of #11131


This change is Reviewable

@highfive
Copy link

Heads up! This PR modifies the following files:

  • @KiChjang: components/net/filemanager_thread.rs, components/net_traits/blob_url_store.rs, components/net_traits/blob_url_store.rs, components/script/dom/url.rs, components/net/blob_loader.rs, components/net_traits/filemanager_thread.rs, components/net_traits/filemanager_thread.rs, components/script/dom/bindings/trace.rs, components/script/dom/htmlinputelement.rs, components/script/dom/blob.rs

@highfive highfive added the S-awaiting-review There is new code that needs to be reviewed. label Jun 27, 2016
@highfive
Copy link

warning Warning warning

  • These commits modify net and script code, but no tests are modified. Please consider adding a test!

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Not complete implementation yet, but the general ideas are as shown in the code.

This PR might take a long time to review, and I am considering adding some tests as well.

@emilio
Copy link
Member

emilio commented Jun 27, 2016

Just reviewing style nits and similar as before to speed things up, Manish will still need to review the rest :)

+S-needs-code-changes


Review status: 0 of 8 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.


components/net_traits/filemanager_thread.rs, line 11 [r1] (raw file):

#[derive(Clone, Deserialize, Serialize)]
pub struct RelativePos {

So the rust standard library has an struct Range<Idx> for this purpose. If the problem for this is that it's not serializable, I think it's better to either pr bincode or the appropriate repo, or implement it here.


components/net_traits/filemanager_thread.rs, line 12 [r1] (raw file):

#[derive(Clone, Deserialize, Serialize)]
pub struct RelativePos {
    pub start: Option<i64>,

Also, if this isn't negative should be an unsigned integer, like usize.


components/net_traits/filemanager_thread.rs, line 19 [r1] (raw file):

    pub fn full() -> RelativePos {
        RelativePos {
            start: None,

Also, thinking about it a bit more, it seems you always want a start, even though it might be 0, and the None case would represent "until the end of the blob" I guess, in which case you need the custom struct.


components/script/dom/blob.rs, line 293 [r1] (raw file):

                BlobImpl::File(BlobFileImpl {
                    id: fimpl.id.clone(),
                    cached: DOMRefCell::new(None),

We probably should try to use fimpls cached version here to keep the user-visible behavior consistent, since if not a slice can really represent other memory if the underlying file changes.


components/script/dom/blob.rs, line 295 [r1] (raw file):
You should probably add the indices to the start field of the current blob, right? i.e., if you have a slice from position 5 to 10, and you want to slice(2, 3), I guess you want to have the slice 7, 8, and not 2, 3.

Also, I don't know what the requirements about bounds-checking are, but it seems you need to know the size of the blob to make the calculation in case the start param is negative:

If start is negative, let relativeStart be max((size + start), 0).

If you plan to do all this that lazily (which is smart btw) you might need the negative integer type for indices, but you can always replace the None value by 0.

Also, the problem with it being lazy and not having the file in memory, is that the contents of a slice can be different of those from the sliced blob, hence my previous comment.

In any case, we should try to check how other browsers handle this.


components/script/dom/htmlinputelement.rs, line 1140 [r1] (raw file):

            InputType::InputFile => {
                let window = window_from_node(self);
                let origin = window.get_url().origin().unicode_serialization();

Can't we just send the Origin structure?


components/script/dom/url.rs, line 128 [r1] (raw file):

        }

        let slice = match blob.get_blob_impl() {

nit: you should be able to do

match *blob.get_blob_impl() {
    BlobImpl::File(ref f) => { ...

which reads a bit nicer IMHO.


Comments from Reviewable

@emilio emilio added the S-needs-code-changes Changes have not yet been made that were requested by a reviewer. label Jun 27, 2016
struct FileManager<UI: 'static + UIProvider> {
receiver: IpcReceiver<FileManagerThreadMsg>,
idmap: HashMap<Uuid, PathBuf>,
store: Arc<RwLock<HashMap<Uuid, FileStoreEntry>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the entries themselves have some locking mechanism so that you only need to W-lock the store for small operations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean a finer-grained lock per entry?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So on insertion of a buffer entry (which is slow), you insert an empty entry, then downgrade the main lock to a read lock, and take a write lock to the entry and fill it in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be handled later. I'd prefer if there was a nice concurrent hashmap out there (there probably is) that can be used to abstract over the downgrading.

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Cool, thanks!


Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


components/net_traits/filemanager_thread.rs, line 11 [r1] (raw file):

Previously, emilio (Emilio Cobos Álvarez) wrote…

So the rust standard library has an struct Range<Idx> for this purpose. If the problem for this is that it's not serializable, I think it's better to either pr bincode or the appropriate repo, or implement it here.

That looks good, but I wonder if it is proper to introduce the extra semantics of Rust standard here? Since that might not be the same concept of "slice" in Rust

components/net_traits/filemanager_thread.rs, line 12 [r1] (raw file):

Previously, emilio (Emilio Cobos Álvarez) wrote…

Also, if this isn't negative should be an unsigned integer, like usize.

Good point, I used to write it that way but found that I can't compute the unsigned offsets unless I know the total length of bytes (as suggested by name "Relative")

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


components/script/dom/htmlinputelement.rs, line 1140 [r1] (raw file):

Previously, emilio (Emilio Cobos Álvarez) wrote…

Can't we just send the Origin structure?

Sorry to say that it is blocked by https://github.com//issues/11722. I will fix it after the Origin can be safely serialized.

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


components/script/dom/url.rs, line 128 [r1] (raw file):

Previously, emilio (Emilio Cobos Álvarez) wrote…

nit: you should be able to do

match *blob.get_blob_impl() {
    BlobImpl::File(ref f) => { ...

which reads a bit nicer IMHO.

Nice 😆 I always doubt how should I write this better

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


components/script/dom/blob.rs, line 293 [r1] (raw file):

Previously, emilio (Emilio Cobos Álvarez) wrote…

We probably should try to use fimpls cached version here to keep the user-visible behavior consistent, since if not a slice can really represent other memory if the underlying file changes.

But that will force the content to be read and cached etc. if it is not cached yet. This is a good point though and we have been wondering how to handle this best for some time. @Manishearth any suggestion on this?

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


components/script/dom/blob.rs, line 295 [r1] (raw file):

Previously, emilio (Emilio Cobos Álvarez) wrote…

You should probably add the indices to the start field of the current blob, right? i.e., if you have a slice from position 5 to 10, and you want to slice(2, 3), I guess you want to have the slice 7, 8, and not 2, 3.

Also, I don't know what the requirements about bounds-checking are, but it seems you need to know the size of the blob to make the calculation in case the start param is negative:

If start is negative, let relativeStart be max((size + start), 0).

If you plan to do all this that lazily (which is smart btw) you might need the negative integer type for indices, but you can always replace the None value by 0.

Also, the problem with it being lazy and not having the file in memory, is that the contents of a slice can be different of those from the sliced blob, hence my previous comment.

In any case, we should try to check how other browsers handle this.

For the first point, [it looks like](https://w3c.github.io/FileAPI/#slice-method-algo) that the original slice position are not considered if you do `slice` on the blob again, i.e. we replace the slice positions.

For the second point, it is really a good insight that None is equivalent to 0 in the first field, I will consider this further.


Comments from Reviewable

@Manishearth
Copy link
Member

Reviewed 8 of 8 files at r1.
Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


components/net/filemanager_thread.rs, line 192 [r1] (raw file):

    fn select_files(&mut self, patterns: Vec<FilterPattern>,
                    sender: IpcSender<FileManagerResult<Vec<SelectedFile>>>,
                    origin: String) {

could we type FileOrigin = String somewhere in this file so that it is easy to update later?


components/net_traits/filemanager_thread.rs, line 11 [r1] (raw file):

Previously, izgzhen (Zhen Zhang) wrote…

That looks good, but I wonder if it is proper to introduce the extra semantics of Rust standard here? Since that might not be the same concept of "slice" in Rust

The Rust stlib type is a simple one, that's fine to use.

components/net_traits/filemanager_thread.rs, line 67 [r1] (raw file):

    AddIndirectEntry(SelectedFileId, RelativePos, IpcSender<Result<String, BlobURLStoreError>>, String),

    /// Increate reference

nit:spelling


components/script/dom/blob.rs, line 284 [r1] (raw file):

        let blob_impl = match self.blob_impl {
            BlobImpl::File(ref fimpl) => {

Note that once you have slice impls on the DOM side, this will get a bit complicated; since it's no longer a single atomic incref that you need to do.


components/script/dom/blob.rs, line 293 [r1] (raw file):

Previously, izgzhen (Zhen Zhang) wrote…

But that will force the content to be read and cached etc. if it is not cached yet. This is a good point though and we have been wondering how to handle this best for some time. @Manishearth any suggestion on this?

Most browsers allow the underlying file to change and just truncate newly-out-of-bounds slices

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


components/net/filemanager_thread.rs, line 192 [r1] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

could we type FileOrigin = String somewhere in this file so that it is easy to update later?

Nice suggestion

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 27, 2016

Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


components/script/dom/blob.rs, line 284 [r1] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

Note that once you have slice impls on the DOM side, this will get a bit complicated; since it's no longer a single atomic incref that you need to do.

Hmm? Can you elaborate more on that?

Comments from Reviewable

@highfive highfive removed the S-needs-code-changes Changes have not yet been made that were requested by a reviewer. label Jun 29, 2016
@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 29, 2016

Review status: 2 of 13 files reviewed at latest revision, 11 unresolved discussions.


components/net/filemanager_thread.rs, line 112 [r1] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

I think this can be handled later. I'd prefer if there was a nice concurrent hashmap out there (there probably is) that can be used to abstract over the downgrading.

I removed the `Arc

Comments from Reviewable

@Manishearth
Copy link
Member

Review status: 2 of 13 files reviewed at latest revision, 9 unresolved discussions.


components/net/filemanager_thread.rs, line 112 [r1] (raw file):

Previously, izgzhen (Zhen Zhang) wrote…

I removed the Arc<RwLock on store and change FileStoreEntry::refs to AtomicUsize. But then I wonder if there is a possibility of race at all since the store is totally owned by the thread and the thread is tied to only one event loop?

Oh, in that case we don't need a lock or atomic at all. However, I thought this event loop spawns individual threads for each request? If not, it probably should.

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jun 29, 2016

Review status: 2 of 13 files reviewed at latest revision, 10 unresolved discussions.


components/net/filemanager_thread.rs, line 112 [r1] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

Oh, in that case we don't need a lock or atomic at all. However, I thought this event loop spawns individual threads for each request? If not, it probably should.

Not really, but maybe we should consider that in case certain request might pertains too long (or async)?

Comments from Reviewable

@Manishearth
Copy link
Member

Reviewed 13 of 13 files at r2.
Review status: all files reviewed at latest revision, 13 unresolved discussions.


components/net/filemanager_thread.rs, line 112 [r1] (raw file):

Previously, izgzhen (Zhen Zhang) wrote…

Not really, but maybe we should consider that in case certain request might pertains too long (or async)?

Yes, we should, but this we can handle later. With pooling or something.

components/net/filemanager_thread.rs, line 165 [r2] (raw file):

                Some(entry) => {
                    if entry.origin == origin_in {
                        entry.refs.fetch_add(1, Ordering::Relaxed);

This doesn't need to be atomic, since you already have &mut self


components/net/filemanager_thread.rs, line 182 [r2] (raw file):

                        let _ = sender.send(Err(BlobURLStoreError::InvalidOrigin));
                        return;
                    }

we should incref here, no?


components/script/dom/blob.rs, line 98 [r2] (raw file):

    /// Sliced memory-backed blob, including pointer to parent blob and
    /// a data slice representing bytes in currently sliced range
    SlicedMemory(JS<Blob>, DataSlice),

Can't we just represent slicedfile by SlicedMemory or File?

If it is a slice that does not have its own uuid, represent it as SlicedMemory(parent_blob, slice). If it does have its own uuid, just pretend it's a File, since from the script thread's point of view files and slices with uuids are just uuids.


components/script/dom/blob.rs, line 254 [r2] (raw file):

    }

    #[allow(unrooted_must_root)]

This shouldn't be here. Only on new() methods. Add new_foo functions on blob that take in an &JS or &Blob and and return a Root, with different impls


Comments from Reviewable

@Manishearth Manishearth added S-needs-code-changes Changes have not yet been made that were requested by a reviewer. and removed S-awaiting-review There is new code that needs to be reviewed. labels Jul 3, 2016
@izgzhen
Copy link
Contributor Author

izgzhen commented Jul 3, 2016

Review status: all files reviewed at latest revision, 14 unresolved discussions.


components/net/filemanager_thread.rs, line 165 [r2] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

This doesn't need to be atomic, since you already have &mut self

Good point .... I am hesitating if or not doing spawning in this PR ... anyway I will take the `Atomic` down for now

Comments from Reviewable

@Manishearth
Copy link
Member

To be clear, I'm okay with having all three of SlicedFile, SlicedMemory, and File, but I'd like to know why we need it 😄

@Manishearth
Copy link
Member

Right, don't worry about spawning/threading/etc here. If we do move this to pooling, we probably will do it along with the net stack.

@izgzhen
Copy link
Contributor Author

izgzhen commented Jul 3, 2016

Review status: all files reviewed at latest revision, 14 unresolved discussions.


components/net/filemanager_thread.rs, line 182 [r2] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

we should incref here, no?

Well ... I suppose that the holder of this indirect UUID will revoke the indirect one so itself is serving like an incref except with extra boundaries

Comments from Reviewable

@Manishearth
Copy link
Member

Review status: all files reviewed at latest revision, 14 unresolved discussions.


components/net/filemanager_thread.rs, line 182 [r2] (raw file):

Previously, izgzhen (Zhen Zhang) wrote…

Well ... I suppose that the holder of this indirect UUID will revoke the indirect one so itself is serving like an incref except with extra boundaries

Incref should be encapsulated as much as possible; and the API consumer should rarely (if ever) have to incref themselves.

Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jul 3, 2016

Review status: all files reviewed at latest revision, 14 unresolved discussions.


components/script/dom/blob.rs, line 98 [r2] (raw file):

Previously, Manishearth (Manish Goregaokar) wrote…

Can't we just represent slicedfile by SlicedMemory or File?

If it is a slice that does not have its own uuid, represent it as SlicedMemory(parent_blob, slice). If it does have its own uuid, just pretend it's a File, since from the script thread's point of view files and slices with uuids are just uuids.

I think we can. Just the `SlicedMemory(JS, DataSlice)` will need to be generalized into `Sliced(JS, RelativePos)` and I think it is like adding another layer of indirection if we generalize here.

And if we consider the case when data blob parent gets "promoted" into uuid blob then things might be a bit different. I'd like to think for a while on this point.


Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jul 4, 2016

Review status: 9 of 13 files reviewed at latest revision, 17 unresolved discussions.


components/net/filemanager_thread.rs, line 334 [r5] (raw file):

    }

    fn dec_ref(&mut self, id: Uuid, origin_in: FileOrigin) {

@Manishearth The destruction is here


Comments from Reviewable

@izgzhen
Copy link
Contributor Author

izgzhen commented Jul 4, 2016

@Manishearth Now all fixed, you can review the change when convenient

@Manishearth
Copy link
Member

@bors-servo r+


Reviewed 4 of 4 files at r5, 1 of 1 files at r6.
Review status: all files reviewed at latest revision, 17 unresolved discussions.


Comments from Reviewable

@bors-servo
Copy link
Contributor

📌 Commit 1e9992b has been approved by Manishearth

@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-awaiting-review There is new code that needs to be reviewed. S-needs-rebase There are merge conflict errors. labels Jul 4, 2016
bors-servo pushed a commit that referenced this pull request Jul 4, 2016
Integration and improvements of File API backends

Basically three major changes:

1. More complete origin check in `FileManagerThreadMsg`
2. Add reference counting logic to file manage store and script API
3. Integrate the support of slicing

r? @Manishearth

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/11875)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

⌛ Testing commit 1e9992b with merge fc5fdd8...

@bors-servo
Copy link
Contributor

💔 Test failed - mac-dev-unit

@highfive highfive added S-tests-failed The changes caused existing tests to fail. and removed S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. labels Jul 4, 2016
@Manishearth
Copy link
Member

diff --git a/ports/cef/Cargo.lock b/ports/cef/Cargo.lock
index 7150e13..949a8e2 100644
--- a/ports/cef/Cargo.lock
+++ b/ports/cef/Cargo.lock
@@ -1359,6 +1359,7 @@ dependencies = [
  "lazy_static 0.2.1 (registry+https://github.com/rust-lang/crates.io-index)",
  "log 0.3.6 (registry+https://github.com/rust-lang/crates.io-index)",
  "msg 0.0.1",
+ "num-traits 0.1.32 (registry+https://github.com/rust-lang/crates.io-index)",
  "serde 0.7.11 (registry+https://github.com/rust-lang/crates.io-index)",
  "serde_macros 0.7.11 (registry+https://github.com/rust-lang/crates.io-index)",
  "url 1.1.1 (registry+https://github.com/rust-lang/crates.io-index)",

@Manishearth
Copy link
Member

Please apply that diff and commit 😄

1. More complete origin check in FileManagerThreadMsg
2. Add reference counting logic to file manage store and script API
3. Integrate the support of slicing
@highfive highfive added S-awaiting-review There is new code that needs to be reviewed. and removed S-tests-failed The changes caused existing tests to fail. labels Jul 4, 2016
@izgzhen
Copy link
Contributor Author

izgzhen commented Jul 4, 2016

Pushed

@Manishearth
Copy link
Member

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit 14d6896 has been approved by Manishearth

@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-awaiting-review There is new code that needs to be reviewed. labels Jul 4, 2016
@bors-servo
Copy link
Contributor

⌛ Testing commit 14d6896 with merge 36974f0...

bors-servo pushed a commit that referenced this pull request Jul 4, 2016
Integration and improvements of File API backends

Basically three major changes:

1. More complete origin check in `FileManagerThreadMsg`
2. Add reference counting logic to file manage store and script API
3. Integrate the support of slicing

r? @Manishearth

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/11875)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - android, arm32, arm64, linux-dev, linux-rel, mac-dev-unit, mac-rel-css, mac-rel-wpt, windows

@bors-servo bors-servo merged commit 14d6896 into servo:master Jul 4, 2016
@highfive highfive removed the S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. label Jul 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants