It lists the files that you keep on your storage drives, like a common file browser. It is aimed at helping digital hoarders find their files.
In other words: it's a file manager snapshot explorer, or a file explorer for JSON snapshots with meta information of local files.
The scanner for creating the snapshots is included.
Currently, it's implemented this way:
- The website is used as the explorer. (You can also run it locally.)
- JSON files (snapshots) are used as a database for the explorer.
- The scanner, a JavaScript file, is used for creating snapshots. You need to run it with Node.js from a terminal.
Yeah, it would be much more convenient if it were a standalone GUI application, but it's usable anyway.
So, you can create a meta snapshot of the content of your external or local hard drive and use this explorer to look at that content while the drive is detached. It is handy for local hard drives too: it can be an alternative to Windows Explorer's search.
There are two parts of the software:
- The scanner. It creates JSON snapshots with meta information (`name`, `type`, `size`, `mtime`, `crtime`, `hardlink`/`symlink` info) of your local files. (See: how to use ↓)
- The explorer to open these JSON snapshots.
Just look at the examples (click on the links to open the site with the demo scans):
Win 10 scan (as Admin) | Win 10 scan
Ubuntu scan (as Root) | Ubuntu scan
Mega.nz folders
Also, it supports mega.nz folder links (like this: `https://mega.nz/folder/ABcDE123#XXX`).
Just paste them in the search field.
It is only for browsing, but even this is useful since you can use the advanced search of this explorer (for example, to find recently added files, or to find a file by its size).
Some search examples:
- For case-sensitive search use `//`, for example: `//.EXE`
- Search by file type, for example: `/type:folder/query` (`folder`, `file`, `symlink` for Windows, and additionally `fifo`, `charDev`, `blockDev`, `socket` for other platforms).
- To list recursively all items of an opened folder use the `//` search.
- `/size:0`: find 0 byte size entries; aka `/size/0`, aka `/s:0`, aka `/s/0`
- `/s/120,900`: find 120900 bytes size entries; aka `/s/120 900`
- `/size:120+80`: find from 120 to 200
- `/size:150+-50`: find from 100 to 150
- `/size:80-110`: find from 80 to 110; aka `/size:110-80`
- `/size:200~50`: find from 150 to 250
- `/size:^2`: find byte size starts with "2"
- `/size:%2`: find byte size includes "2"
- `/size:$0`: find byte size ends with "0"
- `/size:120~`: find from 120 - 5% to 120 + 5%
- `/size:120~~`: find from 120 - 10% to 120 + 10%
- `/size:120~~~`: find from 120 - 15% to 120 + 15%
- `/sizek:5`: find 5 KB ± 0.1 KB
- `/sizek:50`: find 50 KB ± 1 KB
- `/sizek:500`: find 500 KB ± 1 KB
- `/sizem:5`: find 5 MB ± 0.1 MB
- `/sizeg/50`: find 50 GB ± 1 GB
- `/size:5m`: find 5 MB ± 0.1 MB
- `/s/12.9`: find 12 bytes size entries
- `/sk/12.9`: find 12.9 KB ± 1 KB
- `/sizek:5!`: find 5 KB + (0 - 0.01) KB
- `/sizek:5!!`: find 5 KB + (0 - 0.001) KB
- `/s/5k!!`: find 5 KB + (0 - 0.001) KB
- `/sizem:50!`: find 50 MB + (0 - 0.1) MB
- `/sizem:50!!`: find 50 MB + (0 - 0.01) MB
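To make the byte-range forms of the size search more concrete, here is a minimal sketch of how such expressions could be turned into a `[min, max]` range. The `parseSizeRange` function is a hypothetical name used only for illustration; it is not the explorer's actual parser, and it covers only the plain byte forms listed above (no `k`/`m`/`g` units, no `^`, `%`, `$` or `!` modifiers).

```javascript
// Hypothetical helper (not the explorer's real code): turns the numeric part
// of a `/size:` query into an inclusive [min, max] range in bytes.
function parseSizeRange(expr) {
    let m;
    // "150+-50" -> from (150 - 50) to 150
    if ((m = expr.match(/^(\d+)\+-(\d+)$/))) {
        return [Number(m[1]) - Number(m[2]), Number(m[1])];
    }
    // "120+80" -> from 120 to (120 + 80)
    if ((m = expr.match(/^(\d+)\+(\d+)$/))) {
        return [Number(m[1]), Number(m[1]) + Number(m[2])];
    }
    // "80-110" or "110-80" -> from the smaller to the larger value
    if ((m = expr.match(/^(\d+)-(\d+)$/))) {
        const a = Number(m[1]);
        const b = Number(m[2]);
        return [Math.min(a, b), Math.max(a, b)];
    }
    // "200~50" -> 200 plus/minus 50
    if ((m = expr.match(/^(\d+)~(\d+)$/))) {
        return [Number(m[1]) - Number(m[2]), Number(m[1]) + Number(m[2])];
    }
    // "120~", "120~~", "120~~~" -> plus/minus 5%, 10%, 15%
    if ((m = expr.match(/^(\d+)(~{1,3})$/))) {
        const value = Number(m[1]);
        const delta = value * (5 * m[2].length) / 100;
        return [value - delta, value + delta];
    }
    // "0" -> exact size
    if (/^\d+$/.test(expr)) {
        return [Number(expr), Number(expr)];
    }
    return null; // anything else is out of scope for this sketch
}

console.log(parseSizeRange("120+80"));  // from 120 to 200
console.log(parseSizeRange("150+-50")); // from 100 to 150
```

An entry would then match when its `size` lies between the two returned values.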
You can search by pasting a URL in the search field. If the URL's pathname ends with information that is included in file names, you will find those files, for example:
Search `https://i.imgur.com/x09ICAM.jpeg` will list:
- [imgur] 2015.09.18—x09ICAM—3456.jpg
Search `https://pbs.twimg.com/media/FFyAEbTUYAEUm9p?format=jpg&name=4096x4096` will list:
- [twitter] SpaceX—2021.12.04—1467202583840243712—FFyAEbTUYAEUm9p.jpg
Search `https://twitter.com/SpaceX/status/1463536409667530755` will list:
- [twitter] SpaceX—2021.11.24—1463536409667530755—FE-GSYWUcAgc9BN.jpg
- [twitter] SpaceX—2021.11.24—1463536409667530755—FE-GWCKVgAorchy.jpg
- [twitter] SpaceX—2021.11.24—1463536409667530755—FE-GXQYUYAcfvUl.jpg
BTW, look at this thing.
Search `https://www.instagram.com/p/CWqleONFgK4/` will list:
- [inst] spacex—2021.11.24—CWqleONFgK4—260727536_471249894638202_6939249621560480797_n.jpg
- [inst] spacex—2021.11.24—CWqleONFgK4—259683601_272614248155822_44594239323667647_n.jpg
Search `https://www.youtube.com/watch?v=_qwLHlVjRyw` will list:
- [yt] SpaceX—2020.12.23—_qwLHlVjRyw—Starship _ SN8 _ High-Altitude Flight Recap.description
- [yt] SpaceX—2020.12.23—_qwLHlVjRyw—Starship _ SN8 _ High-Altitude Flight Recap.webm
- [yt] SpaceX—2020.12.23—_qwLHlVjRyw—Starship _ SN8 _ High-Altitude Flight Recap.webp
Search `https://gfycat.com/incompletealarmedicelandichorse` will list:
- [gfycat] blaze0044—2021.03.27—IncompleteAlarmedIcelandichorse—Space X rocket breakup.mp4
- [gfycat] blaze0044—2021.03.27—IncompleteAlarmedIcelandichorse—Space X rocket breakup.webm
BTW, look at this thing.
It will CamelCase `incompletealarmedicelandichorse` to `IncompleteAlarmedIcelandichorse`.
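The URL search above can be pictured roughly as follows. This is only a sketch under assumptions: `urlToSearchToken` and `nameMatchesUrl` are hypothetical names, the YouTube `v` parameter is handled as a special case because the video ID is not in the pathname, and a plain case-insensitive comparison is used here instead of the CamelCase conversion mentioned for gfycat.

```javascript
// Hypothetical helper: picks the part of a URL that is likely to appear in a file name.
function urlToSearchToken(input) {
    const url = new URL(input);
    // YouTube keeps the video ID in the `v` query parameter, not in the pathname.
    const videoId = url.searchParams.get("v");
    if (url.hostname.endsWith("youtube.com") && videoId) {
        return videoId;
    }
    // Otherwise take the last non-empty pathname segment without its extension.
    const segments = url.pathname.split("/").filter(Boolean);
    const last = segments[segments.length - 1] || "";
    return last.replace(/\.[^.]+$/, "");
}

// Case-insensitive check, so "incomplete..." also matches "Incomplete...".
function nameMatchesUrl(fileName, input) {
    const token = urlToSearchToken(input).toLowerCase();
    return token !== "" && fileName.toLowerCase().includes(token);
}

console.log(urlToSearchToken("https://i.imgur.com/x09ICAM.jpeg")); // x09ICAM
```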
It supports selecting multiple scans. Just open two (or more) scans with the file input, or drag and drop multiple scan files onto the site.
Additionally, you can open multiple tabs and add the `searchSync=true` URL search parameter, for example:
- ...?filepath=.../windows-scan.json.gz&searchSync=true
- ...?searchSync=true&filepath=.../ubuntu-scan.json.gz
A search in one tab will trigger the search in the other tabs that have the `searchSync=true` URL search parameter.
Also, each tab will display the search result count.
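If you prefer to build such links in code, the standard `URL` API can add the parameters for you. The `withSearchSync` function below is a hypothetical helper, and the `example.com` scan address is a placeholder, not a real scan.

```javascript
// Hypothetical helper: adds `searchSync=true` (and optionally `filepath`) to the explorer URL.
function withSearchSync(explorerUrl, filepath) {
    const url = new URL(explorerUrl);
    if (filepath) {
        // `filepath` points to the scan file that the tab should open.
        url.searchParams.set("filepath", filepath);
    }
    url.searchParams.set("searchSync", "true");
    return url.toString();
}

// Placeholder scan location, used only for illustration.
console.log(withSearchSync(
    "https://alttiri.github.io/keep-lister/",
    "https://example.com/windows-scan.json.gz"
));
```

Open each generated link in its own tab to get the synchronized search described above.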
Just use this link for your scans:
In short.
- Installed Node.js is required,
- Download the scanner file — qq-keep-lister-scanner.mjs,
- Open a terminal (CMD.exe, for example) in a folder you want to scan,
- Run the scanner (JS file) with Node.js. Type in a terminal (for Windows with CMD):
node %USERPROFILE%\Downloads\qq-keep-lister-scanner.mjs
- Don't forget to press Enter.
If you use Linux, your command looks like this:
node ~/Downloads/qq-keep-lister-scanner.mjs
The result will be in your download folder.
Note: replace `%USERPROFILE%\Downloads\qq-keep-lister-scanner.mjs` with the path to the scanner file if you have moved it from the download folder to another place.
To explore the scan result, use the https://alttiri.github.io/keep-lister/ site.
Or run it locally:
- Download the source code keep-lister-master.zip,
- Unpack it,
- Open a terminal in the `keep-lister-master` folder,
- Type `npm ci`,
- Then `npm run build`,
- Then `npm run serve`.
The site will be available on http://localhost:5000/. Use `Ctrl + C` in the console to stop the server. Next time you only need to type `npm run serve` to start the server.
It uses Windows-like file size formatting (`1133158 bytes` → `1.08 MB`).
In most cases the result is equal to the Windows Explorer result, but in very rare cases it is not. See the test file.
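As a rough illustration of this kind of formatting, the sketch below divides by 1024 and truncates (rather than rounds) to three significant digits. It is an assumption about the general approach, not the explorer's implementation; `formatSize` is a hypothetical name, and the sketch does not try to reproduce every Windows Explorer edge case.

```javascript
// Hypothetical sketch of Windows-like size formatting (binary units, truncation).
function formatSize(bytes) {
    const units = ["bytes", "KB", "MB", "GB", "TB"];
    let value = bytes;
    let unit = 0;
    while (value >= 1024 && unit < units.length - 1) {
        value /= 1024;
        unit++;
    }
    if (unit === 0) {
        return `${bytes} bytes`;
    }
    // Keep three significant digits: 1.08, 10.8, 108.
    const decimals = value < 10 ? 2 : value < 100 ? 1 : 0;
    const factor = 10 ** decimals;
    // Truncate instead of rounding.
    const truncated = Math.floor(value * factor) / factor;
    return `${truncated.toFixed(decimals)} ${units[unit]}`;
}

console.log(formatSize(1133158)); // 1.08 MB
```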
JSON scans can have a noticeable size, so they are gzipped to reduce the size by a factor of 5-10. For example, a scan of a Windows disk C (300k files, 90k folders) takes 5.8 MB gzipped (55 MB of raw JSON). The explorer handles the scan as a stream, so it displays the result as soon as the first bytes of the scan are read.
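The idea of handling a scan as a stream can be sketched as follows. This is not the explorer's parser: `createEntryParser` is a hypothetical name, and the sketch assumes that the text is already unpacked and that every entry occupies exactly one line starting with `{"type"`, so the multi-line header object is simply skipped.

```javascript
// Hypothetical incremental parser: call `push` with text chunks as they arrive;
// `onEntry` is called for every complete entry line seen so far.
function createEntryParser(onEntry) {
    let buffer = "";
    return function push(chunk) {
        buffer += chunk;
        const lines = buffer.split("\n");
        buffer = lines.pop(); // the last piece may be an incomplete line
        for (const line of lines) {
            const text = line.trim().replace(/,$/, ""); // drop the trailing comma
            if (text.startsWith('{"type"')) {
                onEntry(JSON.parse(text));
            }
        }
    };
}

// Usage: entries are reported before the whole file has been read.
const names = [];
const push = createEntryParser(entry => names.push(entry.name));
push('[\n{\n"files": 1\n},\n{"type":"fol');
push('der","name":"a","pid":null,"id":0},\n{"type":"file","name":"b","pid":0,"size":1}\n]');
console.log(names); // [ 'a', 'b' ]
```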
A JSON snapshot is a valid JSON file, but it is specially formatted to simplify stream parsing and opening in text editors.
Just look at the example (Note: some lines are trimmed to reduce size):
[
{
"path": ["C:", "Users", "User", "Downloads"],
"separator": "\\",
"scanDate": "2021.09.24 23:52:02Z",
"platform": "win32",
"files": 37,
"folders": 7,
"symlinks": 0,
"fifos": 0,
"charDevs": 0,
"blockDevs": 0,
"sockets": 0,
"total": 44,
"errors": 0,
"mHardLinks": 0,
"mHardLinksTotal": 0,
"errorsMap": {}
},
{"type":"folder","name":"keep-lister-master","pid":null,"id":0,"mtime":1632527508334,"btime":1632527508312},
{"type":"folder","name":".github","pid":0,"id":1,"mtime":1632526451000,"btime":1632527508312},
{"type":"file","name":".gitignore","pid":0,"mtime":1632526451000,"btime":1632527508314,"size":53},
{"type":"file","name":"index.html","pid":0,"mtime":1632526451000,"btime":1632527508315,"size":843},
{"type":"file","name":"package-lock.json","pid":0,"mtime":1632526451000,"btime":1632527508315,"size":70293},
{"type":"file","name":"package.json","pid":0,"mtime":1632526451000,"btime":1632527508316,"size":874},
{"type":"file","name":"README.md","pid":0,"mtime":1632526451000,"btime":1632527508314,"size":4437},
{"type":"folder","name":"scanner","pid":0,"id":2,"mtime":1632526451000,"btime":1632527508316},
{"type":"folder","name":"src","pid":0,"id":3,"mtime":1632526451000,"btime":1632527508319},
{"type":"folder","name":"tests","pid":0,"id":4,"mtime":1632526451000,"btime":1632527508332},
{"type":"file","name":"vite.config.js","pid":0,"mtime":1632526451000,"btime":1632527508334,"size":5379},
{"type":"folder","name":"workflows","pid":1,"id":5,"mtime":1632526451000,"btime":1632527508313},
{"type":"file","name":"blank.yml","pid":5,"mtime":1632526451000,"btime":1632527508313,"size":3260},
{"type":"file","name":"flat-scan-object.js","pid":2,"mtime":1632526451000,"btime":1632527508316,"size":5087},
{"type":"file","name":"meta.js","pid":2,"mtime":1632526451000,"btime":1632527508317,"size":4119},
{"type":"file","name":"scanner.js","pid":2,"mtime":1632526451000,"btime":1632527508318,"size":8607},
{"type":"file","name":"util-node.js","pid":2,"mtime":1632526451000,"btime":1632527508318,"size":7314},
{"type":"file","name":"App.vue","pid":3,"mtime":1632526451000,"btime":1632527508319,"size":288},
{"type":"folder","name":"components","pid":3,"id":6,"mtime":1632526451000,"btime":1632527508319},
{"type":"file","name":"AddressBar.vue","pid":6,"mtime":1632526451000,"btime":1632527508320,"size":854}
]
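In this format the first object is the scan header, folders carry an `id`, and every entry refers to its parent folder by `pid`. A minimal sketch of rebuilding full paths from such a flat list is shown below. `buildPaths` is a hypothetical helper, and it assumes that `pid: null` marks the root entry and that the header's `path` is the location that contains the root entry; neither assumption is confirmed by the format description.

```javascript
// Hypothetical helper: rebuilds full paths from the flat `id`/`pid` entries.
function buildPaths(snapshot) {
    const [meta, ...entries] = snapshot;
    // Index folders by `id` first, so the order of entries does not matter.
    const folders = new Map();
    for (const entry of entries) {
        if (entry.type === "folder") {
            folders.set(entry.id, entry);
        }
    }
    return entries.map(entry => {
        const parts = [entry.name];
        let pid = entry.pid;
        // Walk up the parent chain until the root (`pid` is null) is reached.
        while (pid !== null && pid !== undefined && folders.has(pid)) {
            const parent = folders.get(pid);
            parts.unshift(parent.name);
            pid = parent.pid;
        }
        return [...meta.path, ...parts].join(meta.separator);
    });
}

// A tiny made-up snapshot in the same shape as the example above.
const demo = [
    {path: ["C:", "Users"], separator: "\\"},
    {type: "folder", name: "root", pid: null, id: 0},
    {type: "folder", name: "src", pid: 0, id: 1},
    {type: "file", name: "App.vue", pid: 1, size: 288},
];
console.log(buildPaths(demo).join("\n"));
```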
There are `folder` and `search` variables in the browser console to handle the entries of the opened folder or of the search result programmatically.
For example:
// List the names of the files in the opened folder as one string
folder.files.map(e => e.name).join("\n")

// Find the longest filenames
// (`flat` (`folder.flat()`) recursively lists all files of the selected directory in an array)
flat.reduce((acc, entry) => {
    const name = entry.name;
    const length = acc[0].length;
    // A longer name replaces everything collected so far
    if (name.length > length) {
        return [name];
    }
    // A name of the same length is added to the result
    if (name.length === length) {
        acc.push(name);
        return acc;
    }
    return acc;
}, [""])
If your files have special filenames (see @AlttiRi/twitter-click-and-save#filename-format) that include additional information about the file, you can use that information, for example, to count downloaded posts (one post can have multiple files).
First, list all files of a certain author, for example, with the `[twitter] SpaceX` search, and then:
// Parse the post ID from the filenames, then count the number of unique IDs.
new Set(
    folder
        .flat()
        .filter(entry => entry.type === "file")
        .map(entry => entry.name)
        .map(name => {
            const result = name.match(/\[twitter\] (?<author>.+)—(?<date>\d{4}\.\d{2}\.\d{2})—(?<postId>[^—]+)—(?<filename>.+)/);
            if (result) {
                return result.groups;
            }
            return null;
        })
        .filter(result => result)
        .map(result => result.postId)
).size
This site is static. It has no backend to handle JSON snapshots; it handles them locally on your machine after your browser has downloaded the site's files (HTML, JS, CSS) hosted by GitHub Pages.
So, you do not send your personal data (JSON snapshots) anywhere. Also, the site has no analytics; it contains only the code that is necessary for it to work.
But that's just words. If you don't trust them (that's OK) and want to be sure that your data is safe without verifying the source code or inspecting the site with DevTools, you can open the site in an incognito window, disable Internet access, work with the site, then close the incognito window, and only after that enable Internet access again.
(Note: the `pako_inflate` library is lazy loaded, so you can open one of the demo snapshots first to load it, or just unpack your JSON snapshot from the `.gz` archive; in this case `pako_inflate` is not needed.)
- Snap2HTML*
- Everything (with unchecked `Automatically remove offline volumes` option)
- FilelistCreator
- Directory Snapshot
*Technically I also can generate a standalone HTML file as output, but do you need it?
There are a lot of things to do to improve it, but the core functionality is ready.