0.0.4 Release -- ran test, seems to work for sample english texts
CypherousSkies committed Oct 27, 2021
1 parent 0463bad commit 16d63dd
Showing 5 changed files with 21 additions and 20 deletions.
26 changes: 12 additions & 14 deletions .idea/workspace.xml

(Generated file; diff not rendered.)

8 changes: 4 additions & 4 deletions README.md
@@ -20,14 +20,14 @@ Unfortunately, for now I only have a cli which is only been tested on linux. Not
## Install

### Windows
-The easiest way of doing this is by installing [WSL](https://docs.microsoft.com/en-us/windows/wsl/) with Ubuntu and follow the Ubuntu/debian instructions.
+The "easiest" way of doing this is by installing [WSL](https://docs.microsoft.com/en-us/windows/wsl/) with Ubuntu and follow the Ubuntu/debian instructions.

If you're fancy and know how to python on windows, tell me how it goes and how you did it!

Note: unfortunately, it's hard to set up gpu stuff for WSL, and even then only really works for CUDA (NVIDIA) cards, which I have no way of testing as of now (not that I could test any gpu stuff now, but that's beyond the point).

### Mac
-Gotta say, I have no idea how to get all the dependencies on mac. A cursory glance says that `brew` or `port` should be able to get most of them, but I have no idea about their availability. If you have a mac and figured this out, let me know how you did it!
+Gotta say, I have no idea how to get all the dependencies (see ubuntu/debian) on mac. A cursory glance says that `brew` or `port` should be able to get most of them, but I have no idea about their availability. If you have a mac and figured this out, let me know how you did it!

### Ubuntu/Debian
`sudo apt install -y python3 python3-venv espeak ffmpeg tesseract-ocr-all python3-dev libenchant-dev libpoppler-cpp-dev pkg-config libavcodec libavtools ghostscript poppler-utils`
@@ -60,11 +60,11 @@ get [pytorch](https://pytorch.org)
Takes ~2-3GB of disk space for install

## Usage
-`r4l [--in_path in/] [--out_path out/] [--lang en]` runs the suite of scanning and correction on all compatible files in the directory `in/` and outputs mp3 files to `out/` using the language `en`.
+`r4l [--in_path in/] [--out_path out/] [--lang en]` runs the suite of scanning and correction on all compatible files in the directory `in/` and outputs mp3 files to `out/` using the language `en` (square brackets denoting optional parameters with default values).

Run `r4l --list_langs` to list supported languages

-This program uses a lot of memory so I'd advise expanding your swap size by ~10GB (for debian use fixswap.sh)
+~~This program uses a lot of memory so I'd advise expanding your swap size by ~10GB (for debian use `fixswap.sh`)~~ (This should be fixed now, but if it runs out of memory/crashes randomly, increase swap size)

### Benchmarks
On my current setup (4 intel i7 8th gen cores, no gpu, debian 10, 5gb ram+7gb swap) takes `0.124*(word count)-3.8` seconds (r^2=0.942,n=6), which is actually pretty good, clocking in at around 10 words per second with some overhead.
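
Note on the benchmark line above: the linear fit can be turned into a quick run-time estimate. This is a minimal sketch, assuming the coefficients quoted in the README (0.124 seconds per word, intercept of -3.8 seconds). `estimate_seconds` is a hypothetical helper, not part of r4l, and the fit comes from only six runs on one machine, so the numbers are rough.

```python
# Coefficients quoted in the README benchmark (fit on n=6 runs, one machine).
SECONDS_PER_WORD = 0.124
INTERCEPT_SECONDS = -3.8


def estimate_seconds(word_count: int) -> float:
    """Rough processing-time estimate from the README's linear fit.

    Hypothetical helper for illustration; not part of r4l.
    The result is clamped at zero because the negative intercept would
    otherwise give negative times for very short texts.
    """
    return max(0.0, SECONDS_PER_WORD * word_count + INTERCEPT_SECONDS)


if __name__ == "__main__":
    for words in (1_000, 10_000, 100_000):
        seconds = estimate_seconds(words)
        print(f"{words:>7} words: ~{seconds / 60:.1f} min")
```

Taken on its own, the slope corresponds to roughly 8 words per second (1 / 0.124).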
4 changes: 3 additions & 1 deletion r4l/util/reader.py
@@ -57,6 +57,7 @@ def _write_to_file(self, wav, fname):
channels=1
)
audio.export(fout, format="mp3")
print(f"| > Wrote {fout}")
return fout, len(audio) / 1000

def tts(self, text, fname):
Expand Down Expand Up @@ -94,6 +95,7 @@ def tts(self, text, fname):
self._write_to_file(wav, fname + str(splits))
splits += 1
wav = None
if splits > 0:
audio = AudioSegment.silent()
print(f"> Collecting {splits} files to final mp3")
for i in tqdm(range(splits)):
@@ -102,7 +104,7 @@ def tts(self, text, fname):
os.remove(file)
audio_time = len(audio) / 1000
audio.export(self.outpath + fname + '.mp3', format='mp3')
-elif splits == 0:
+elif wav is not None and splits == 0:
file, audio_time = self._write_to_file(wav, fname)
else:
raise Exception("Somehow r4l.util.reader.wav is None")
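
Note on the hunk above: the new condition means the single-file branch only runs when `wav` still holds audio. With the old `elif splits == 0:` that branch did not check `wav` at all. Below is a minimal sketch of the resulting three-way branch. `pick_output_mode` is a hypothetical stand-in (not in r4l) that returns a label instead of writing files, and it raises `ValueError` where the original raises a plain `Exception`.

```python
from typing import Optional, Sequence


def pick_output_mode(wav: Optional[Sequence[float]], splits: int) -> str:
    """Mirror the if/elif/else at the end of tts() after this commit.

    Hypothetical helper for illustration only; the real method writes
    mp3 files instead of returning a label.
    """
    if splits > 0:
        # Intermediate files were written; they get merged into one mp3.
        return "merge_splits"
    elif wav is not None and splits == 0:
        # Nothing was split off, so the whole waveform is written at once.
        return "write_single"
    else:
        # Neither split files nor a usable waveform: nothing to export.
        raise ValueError("no audio to export")


if __name__ == "__main__":
    print(pick_output_mode(None, 3))
    print(pick_output_mode([0.0, 0.1], 0))
```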
2 changes: 1 addition & 1 deletion setup.py
@@ -8,7 +8,7 @@
readme = f.read()
setup(
name='reading4listeners',
-version='0.0.4a2',
+version='0.0.4',
packages=['r4l'],
url='https://github.com/CypherousSkies/reading-for-listeners',
project_urls={
1 change: 1 addition & 0 deletions time_data.csv
@@ -13,3 +13,4 @@
100325,216358.53539681435,32180.52498866213
15822,1996.113361120224,5214.462
8352,940.9961097240448,3071.005
14132,1816.9547145366669,4824.103
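
Note on the data above: the README benchmark reports a linear fit of run time against word count. This is a minimal sketch of how such a fit could be computed with ordinary least squares. The CSV header is not visible in this hunk, so it is an assumption that the first column is word count and the second is processing time in seconds. `fit_line` is a hypothetical helper, and this is not necessarily how the published coefficients were produced.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Three rows visible in this hunk; column meanings are assumed.
    # The 100325-word row is omitted because its second value lies far
    # from the line the README reports.
    words = [15822, 8352, 14132]
    seconds = [1996.113361120224, 940.9961097240448, 1816.9547145366669]
    slope, intercept = fit_line(words, seconds)
    print(f"time ~ {slope:.3f} * words + {intercept:.1f}")
```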