Let's try to write out a simple (if that's even possible) example to demonstrate the cache's workflow with P$.
Example Program:
To start, say file1.txt exists and file2.txt does not exist.
Program opens file1.txt for reading only and reads the contents.
Program creates file2.txt (HOW it makes it is very important. Does it use creat or open? What mode does it open with? Does it use O_TRUNC? O_APPEND? Don't you just love this system call interface? Isn't this just so intuitive? 🧠)
Program writes to file2.txt.
Program exits.
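To make the example concrete, here is a minimal sketch of such a program using Python's thin wrappers over the system calls. The function name `run_example` and the flag choice (`O_WRONLY | O_CREAT | O_EXCL`, mode `0o644`) are assumptions for illustration only. The questions above about creat vs. open, O_TRUNC and O_APPEND are left open, and a real traced program could use any of them.

```python
import os


def run_example(input_path, output_path):
    # Open the input read-only and read its contents until EOF.
    fd_in = os.open(input_path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd_in, 65536)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(fd_in)
    data = b"".join(chunks)

    # Create the output. O_EXCL makes the open fail if the file already
    # exists, which matches the "file2.txt does not exist" starting state.
    # (Assumed flags; the tracer has to cope with other combinations too.)
    fd_out = os.open(output_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        # os.write may write fewer bytes than requested, so loop.
        view = memoryview(data)
        while view:
            written = os.write(fd_out, view)
            view = view[written:]
    finally:
        os.close(fd_out)
```

With these flags a second run against the same output path raises FileExistsError, which is one reason the exact creation flags matter to the cache.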
file1.txt is an input, and its contents are read. We hash the file when we see it opened as read only.
I guess the executable is another input, so it should be hashed at the start as well?
Also all the usual suspects: cwd, environment variables, yada yada yada...
file2.txt is an output, as it is created and written to. We hash the file when the program exits. We would then copy the file to our cache.
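A rough sketch of the hashing and the copy-to-cache step. Nothing here is decided yet: SHA-256 as the hash, a cache directory that stores each file under its hash, and the names `hash_file` and `cache_output` are all assumptions for illustration.

```python
import hashlib
import os
import shutil


def hash_file(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in blocks so large files are not loaded into memory at once.
        for block in iter(lambda: f.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def cache_output(path, cache_dir):
    """Hash an output file at program exit and copy it into the cache.

    The cached copy is named after its hash (an assumed layout).
    """
    file_hash = hash_file(path)
    os.makedirs(cache_dir, exist_ok=True)
    shutil.copyfile(path, os.path.join(cache_dir, file_hash))
    return file_hash
```

The same `hash_file` would be called on file1.txt when the read-only open is observed, and on the executable at the start.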
How do we know we can skip?
The hashes of file1.txt and the executable should match the ones we recorded, and file1.txt should be present in the file system in the same location it was before.
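The skip check could look something like this. It assumes the earlier run left us a mapping from absolute path to recorded hash, with the executable treated as just another input. The names `sha256_of` and `can_skip` and the choice of SHA-256 are hypothetical, and cwd and environment variables are not covered here.

```python
import hashlib
import os


def sha256_of(path):
    # Reads the whole file into memory; fine for a sketch.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def can_skip(recorded_inputs):
    """recorded_inputs maps absolute path -> hash recorded on the earlier run.

    Returns True only if every input still exists at the same location
    and its current contents hash to the recorded value.
    """
    for path, recorded_hash in recorded_inputs.items():
        if not os.path.isfile(path):
            return False
        if sha256_of(path) != recorded_hash:
            return False
    return True
```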
How do we skip?
We skip the execution (#42).
We can then copy our file2.txt to its appropriate absolute path for the execution. This also means we need to keep track of that path, if we need to copy the output file over.
Can we get away with not copying the output file over?
If the hashes of file1.txt and the executable matched, and also file2.txt matches and is in the right spot in the file system, we don't have to copy over the file.
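Both cases (copy the output over, or notice it is already in place) fit in one small function. This is a sketch under assumptions: the cache stores each output under its SHA-256 hash, the destination's parent directory already exists, and the names `sha256_of` and `restore_output` are made up for illustration.

```python
import hashlib
import os
import shutil


def sha256_of(path):
    # Reads the whole file into memory; fine for a sketch.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def restore_output(dest_path, recorded_hash, cache_dir):
    """Make sure the cached output is present at dest_path.

    Returns True if a copy was made, False if the file at dest_path
    already matched the recorded hash and the copy was skipped.
    """
    if os.path.isfile(dest_path) and sha256_of(dest_path) == recorded_hash:
        return False
    # Assumed cache layout: one file per output, named after its hash.
    shutil.copyfile(os.path.join(cache_dir, recorded_hash), dest_path)
    return True
```

Note that skipping the copy still costs one hash of the existing output file, so it trades a write for a read.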
Further thoughts:
What if the program only used one file file1.txt? It reads the contents. Then it writes to the file. I think we can handle this, whether it uses O_APPEND or O_TRUNC. This is a little in the weeds, probably represents edge cases, but important to think about and document nonetheless.
We hash the file when it's opened for reading; this is the input file.
We hash the file again at the end of the execution; this is the output file.
When we see this execution again, if our input file matches the one the new execution is using, we can just replace this file by copying over the output file from the cache.
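A sketch of the single-file case, where the same path is both input and output. The `program` callable stands in for the traced execution, and `record_run`, `replay_run`, the dictionary layout and SHA-256 are all assumptions for illustration.

```python
import hashlib
import os
import shutil


def sha256_of(path):
    # Reads the whole file into memory; fine for a sketch.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def record_run(path, cache_dir, program):
    """Run program(path) once, recording the file's hash before and after."""
    input_hash = sha256_of(path)    # taken when the file is opened for reading
    program(path)                   # reads the file, then writes to it
    output_hash = sha256_of(path)   # taken at the end of the execution
    os.makedirs(cache_dir, exist_ok=True)
    shutil.copyfile(path, os.path.join(cache_dir, output_hash))
    return {"path": path, "input_hash": input_hash, "output_hash": output_hash}


def replay_run(entry, cache_dir):
    """Replace the file with the cached output if it matches the recorded input."""
    path = entry["path"]
    if not os.path.isfile(path) or sha256_of(path) != entry["input_hash"]:
        return False
    shutil.copyfile(os.path.join(cache_dir, entry["output_hash"]), path)
    return True
```

Because the two hashes are taken at different times, it does not matter here whether the program appended to the file or truncated it first; only the before and after contents are compared.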
Roughly what I need to implement:
Alter data structures to include file name, full path, and hash of the file
Hash input files (access, openat, open, read, pread64, fstat, newfstatat, stat) when we first see the access.
Hash output files (creat, open, openat, write, writev) at the end of the execution.
Serialize the data structure to a file.
Deserialize the data structure.
Lookups in the data structure.
Copy output files to the "cache" at the end of execution.
Copy output files from the "cache".
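One possible shape for the data structure, with serialization, deserialization and lookup. This is only a sketch: the class and function names, JSON as the on-disk format, and keying executions by the executable's full path are assumptions. A real key would probably also need arguments, cwd and environment variables.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class FileRecord:
    name: str        # file name, e.g. "file1.txt"
    full_path: str   # absolute path at the time of the execution
    hash: str        # content hash as a hex string


@dataclass
class ExecRecord:
    executable: FileRecord
    inputs: list = field(default_factory=list)    # list of FileRecord
    outputs: list = field(default_factory=list)   # list of FileRecord


def save_records(records, path):
    """Serialize {executable full path: ExecRecord} to a JSON file."""
    with open(path, "w") as f:
        json.dump({key: asdict(rec) for key, rec in records.items()}, f)


def load_records(path):
    """Deserialize the JSON file back into ExecRecord objects."""
    with open(path) as f:
        raw = json.load(f)
    return {
        key: ExecRecord(
            executable=FileRecord(**rec["executable"]),
            inputs=[FileRecord(**r) for r in rec["inputs"]],
            outputs=[FileRecord(**r) for r in rec["outputs"]],
        )
        for key, rec in raw.items()
    }


def lookup(records, executable_path):
    """Return the recorded execution for this executable, or None."""
    return records.get(executable_path)
```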