
tweaks

lihaoyi committed Jun 3, 2019
1 parent cfccd44 commit 32ccf9fe457f625243abf6903fd6095ff8d825c3
Showing with 169 additions and 3 deletions.
  1. +169 −3 post/36 - How to work with Files in Scala.md
@@ -568,7 +568,175 @@ res60: IndexedSeq[(Long, os.Path)] = ArrayBuffer(
)
```

## Use Case: Folder Syncing

Let's walk through a second use case: write a program that will take a source
and destination folder, and efficiently update the destination folder to look
like the source folder as files are added to or modified in the source (for
simplicity, we will ignore deletions).

```scala
@ val src = os.pwd / "post"; val dest = os.pwd / "post-copy"
src: os.Path = root/'Users/'lihaoyi/'Github/'blog/'post
dest: os.Path = root/'Users/'lihaoyi/'Github/'blog/"post-copy"
```

Let's also assume that simply deleting the destination and re-copying the source
over is too inefficient:

```scala
@ os.remove.all(dest)
@ os.copy(src, dest)
```

Instead, we want to sync on a per-file/folder basis.

To begin with, we need to recursively walk all contents of the source folder:

```scala
@ val srcContents = os.walk(src)
srcContents: IndexedSeq[os.Path] = ArrayBuffer(
root/'Users/'lihaoyi/'Github/'blog/'post/"9 - Micro-optimizing your Scala code.md",
root/'Users/'lihaoyi/'Github/'blog/'post/"24 - How to conduct a good Programming Interview.md",
root/'Users/'lihaoyi/'Github/'blog/'post/"23 - Scala Vector operations aren't \"Effectively Constant\" time.md",
root/'Users/'lihaoyi/'Github/'blog/'post/'Reimagining,
root/'Users/'lihaoyi/'Github/'blog/'post/'Reimagining/"GithubSearch.png",
...
```

Then, we iterate over every entry and check whether it's a file or a folder:

```scala
@ for(path <- srcContents) println(os.isDir(path))
false
false
false
true
false
false
```

For simplicity, we'll ignore the presence of symbolic links, detectable via
`os.isLink`.
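
If you did want to skip symbolic links explicitly rather than just ignore them, a minimal sketch could filter them out of the walk. This assumes OS-Lib is available as `os` (as it is in the Ammonite REPL used in this post); `walkWithoutLinks` is a hypothetical helper name, not part of OS-Lib.

```scala
// Sketch only: `walkWithoutLinks` is a made-up helper, not an OS-Lib function.
// Assumes OS-Lib is on the classpath (it is bundled with the Ammonite REPL).
def walkWithoutLinks(root: os.Path): IndexedSeq[os.Path] =
  // os.walk lists every entry under `root`; drop the ones that are symlinks
  os.walk(root).filter(p => !os.isLink(p))
```

The rest of this walkthrough keeps using plain `os.walk` and does not treat links specially.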

We can find the corresponding `isDir` for the destination path using:

```scala
@ for(path <- srcContents) println(os.isDir(dest / path.relativeTo(src)))
false
false
false
false
false
false
```

For now, the destination folder doesn't exist, so `isDir` returns `false` on all
of the paths.
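
The path mapping used above can be looked at in isolation. The snippet below is a small sketch that touches no files; it assumes OS-Lib is available as `os`, and the folder and file names are only illustrative.

```scala
// Sketch only: pure path arithmetic, nothing is read from or written to disk.
// Assumes OS-Lib is on the classpath (it is bundled with the Ammonite REPL).
val srcRoot = os.pwd / "post"
val destRoot = os.pwd / "post-copy"

val srcFile = srcRoot / "Reimagining" / "GithubSearch.png"

// relativeTo strips the source root, leaving a relative path...
val rel: os.RelPath = srcFile.relativeTo(srcRoot)

// ...which can then be re-rooted under the destination folder
val destFile: os.Path = destRoot / rel
```

Here `rel` is the relative path `Reimagining/GithubSearch.png`, and `destFile` points at the same location under `post-copy`.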

Next, we walk over the `srcContents` and the corresponding paths in `dest`
together, and if they differ, delete the destination sub-path and copy the
source sub-path over:

```scala
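@ for(srcSubPath <- srcContents) {
    val destSubPath = dest / srcSubPath.relativeTo(src)
    (os.isDir(srcSubPath), os.isDir(destSubPath)) match{
      // one side is a folder and the other is not: replace the destination
      case (false, true) | (true, false) => os.copy.over(srcSubPath, destSubPath)
      // source is a file: copy only if the destination is missing or differs.
      // Arrays need sameElements, since != would compare references
      case (false, false)
        if !os.exists(destSubPath)
        || !os.read.bytes(srcSubPath).sameElements(os.read.bytes(destSubPath)) =>
        os.copy.over(srcSubPath, destSubPath, createFolders = true)
      case _ => // do nothing
    }
  }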
```

Now, we can walk the `dest` path and see all our contents in place:

```scala
@ os.walk(dest)
res13: IndexedSeq[os.Path] = ArrayBuffer(
root/'Users/'lihaoyi/'Github/'blog/"post-copy"/"9 - Micro-optimizing your Scala code.md",
root/'Users/'lihaoyi/'Github/'blog/"post-copy"/"24 - How to conduct a good Programming Interview.md",
root/'Users/'lihaoyi/'Github/'blog/"post-copy"/"23 - Scala Vector operations aren't \"Effectively Constant\" time.md",
root/'Users/'lihaoyi/'Github/'blog/"post-copy"/'Reimagining,
root/'Users/'lihaoyi/'Github/'blog/"post-copy"/'Reimagining/"GithubSearch.png",
root/'Users/'lihaoyi/'Github/'blog/"post-copy"/'Reimagining/"GithubBrowsing.gif",
```

We can wrap this all in a function for easy usage:

```scala
@ def sync(src: os.Path, dest: os.Path) = {
    val srcContents = os.walk(src)
    for(srcSubPath <- srcContents) {
      val destSubPath = dest / srcSubPath.relativeTo(src)
      (os.isDir(srcSubPath), os.isDir(destSubPath)) match{
        // one side is a folder and the other is not: replace the destination
        case (false, true) | (true, false) => os.copy.over(srcSubPath, destSubPath)
        // source is a file: copy only if the destination is missing or differs.
        // Arrays need sameElements, since != would compare references
        case (false, false)
          if !os.exists(destSubPath)
          || !os.read.bytes(srcSubPath).sameElements(os.read.bytes(destSubPath)) =>
          os.copy.over(srcSubPath, destSubPath, createFolders = true)
        case _ => // do nothing
      }
    }
  }
defined function sync
```
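
One detail worth keeping in mind when comparing file contents: `os.read.bytes` returns an `Array[Byte]`, and Scala arrays compare by reference under `==` and `!=`, so a content comparison needs `sameElements` or `java.util.Arrays.equals`. As a rough sketch, the check could be pulled into a helper that also compares sizes first, so files of different length are never read. `sameContents` is a hypothetical name, and the snippet assumes OS-Lib is available as `os`.

```scala
// Sketch only: `sameContents` is a made-up helper, not an OS-Lib function.
// Assumes OS-Lib is on the classpath (it is bundled with the Ammonite REPL).
def sameContents(a: os.Path, b: os.Path): Boolean =
  // Cheap check first: files of different sizes cannot be equal
  os.size(a) == os.size(b) &&
    // Element-wise comparison; `==` on arrays would compare references
    os.read.bytes(a).sameElements(os.read.bytes(b))
```

This still reads both files fully whenever their sizes match, so it is a small optimization rather than a substitute for hashing or block-level comparison.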

To test incremental updates, we can try adding an entry to the `src` folder:

```scala
@ os.write(src / "ABC.txt", "Hello World")
```

Running the sync:

```scala
@ sync(src, dest)
```

We can then see our file has been synced over to `dest`:

```scala
@ os.exists(dest / "ABC.txt")
res29: Boolean = true
@ os.read(dest / "ABC.txt")
res30: String = "Hello World"
```

And modifications to that file also get synced over:

```scala
@ os.write.append(src / "ABC.txt", "\nI am Cow")
@ sync(src, dest)
@ os.read(dest / "ABC.txt")
res33: String = """Hello World
I am Cow"""
```

This use case is greatly simplified so it can fit within a blog
post: we do not consider deletions, syncing permissions, or fine-grained
sub-file level syncing of data (e.g. Dropbox famously syncs in
[4MB blocks](https://www.dropbox.com/developers/reference/content-hash)).
Nevertheless, it should give you a good sense of how working with the filesystem
via Scala's OS-Lib library works, and you can easily extend it if you need more
functionality.
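
As one example of such an extension, here is a rough sketch of how deletions might be handled: walk the destination and remove anything that no longer has a counterpart in the source. `removeDeleted` is a hypothetical helper, not something from OS-Lib or from the code above, and it assumes OS-Lib is available as `os` and that `dest` already exists.

```scala
// Sketch only: `removeDeleted` is a made-up helper, not an OS-Lib function.
// Assumes OS-Lib is on the classpath and that `dest` already exists.
def removeDeleted(src: os.Path, dest: os.Path): Unit = {
  // os.walk returns an eager snapshot, so removing entries while we
  // iterate does not affect the list being traversed
  for (destSubPath <- os.walk(dest)) {
    val srcSubPath = src / destSubPath.relativeTo(dest)
    // A child may already be gone if its parent folder was removed earlier
    if (os.exists(destSubPath) && !os.exists(srcSubPath)) {
      os.remove.all(destSubPath)
    }
  }
}
```

Running something like this alongside the sync step would bring the destination closer to a true mirror, at the cost of one extra walk over `dest`.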

## Conclusion


While we have only covered two use cases in this post, the
[OS-Lib Cookbook](https://github.com/lihaoyi/os-lib#cookbook) has several other
use cases you can browse to see how file handling works in a wider variety of
situations:
@@ -583,8 +751,6 @@ you can do and how to do them:

- [Documentation](https://github.com/lihaoyi/os-lib#os-lib---)

Dealing with files and folders in Scala doesn't need to be difficult or verbose.
With the OS-Lib library, querying information about the filesystem is both
convenient and safe: you can accomplish what you want in very little code, while
