This repository has been archived by the owner on Oct 5, 2023. It is now read-only.

"No such file or directory: 'generator/gpt2/models/model_v5/vocab.bpe'" #39

Open
JeffreyBenjaminBrown opened this issue Dec 8, 2019 · 64 comments
Labels
bug Something isn't working

Comments

@JeffreyBenjaminBrown

When I go to this page and select "Run all", this is what I get:

AI Dungeon 2 will save and use your actions and game to continually improve AI
 Dungeon. If you would like to disable this enter 'nosaving' for any action.
 This will also turn off the ability to save games.

Initializing AI Dungeon! (This might take a few minutes)

Traceback (most recent call last):
  File "play.py", line 211, in <module>
    play_aidungeon_2()
  File "play.py", line 74, in play_aidungeon_2
    generator = GPT2Generator()
  File "/content/AIDungeon/AIDungeon/AIDungeon/AIDungeon/AIDungeon/generator/gpt2/gpt2_generator.py", line 27, in __init__
    self.enc = encoder.get_encoder(self.model_name, models_dir)
  File "/content/AIDungeon/AIDungeon/AIDungeon/AIDungeon/AIDungeon/generator/gpt2/src/encoder.py", line 111, in get_encoder
    with open(os.path.join(models_dir, model_name, 'vocab.bpe'), 'r', encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'generator/gpt2/models/model_v5/vocab.bpe'
@skryspin

skryspin commented Dec 8, 2019

Yes, I also am getting that error.

@thelittlejunglemirza

thelittlejunglemirza commented Dec 8, 2019

I was getting the same error. I then used a different sign-in, and now it gives me this error:

```
Traceback (most recent call last):
  File "play_dm.py", line 48, in <module>
    play_dm()
  File "play_dm.py", line 21, in play_dm
    generator = GPT2Generator(temperature=0.9)
  File "/content/AIDungeon/generator/gpt2/gpt2_generator.py", line 27, in __init__
    self.enc = encoder.get_encoder(self.model_name, models_dir)
  File "/content/AIDungeon/generator/gpt2/src/encoder.py", line 109, in get_encoder
    with open(os.path.join(models_dir, model_name, 'encoder.json'), 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'generator/gpt2/models/model_v5/encoder.json'
```

@azackoff2

azackoff2 commented Dec 8, 2019

I am getting that error as well. That file definitely doesn't exist in the directory location that is specified.

@Jonblu11

Jonblu11 commented Dec 8, 2019

I am also getting the error.

Initializing AI Dungeon! (This might take a few minutes)

Traceback (most recent call last):
  File "play.py", line 211, in <module>
    play_aidungeon_2()
  File "play.py", line 74, in play_aidungeon_2
    generator = GPT2Generator()
  File "/content/AIDungeon/AIDungeon/generator/gpt2/gpt2_generator.py", line 27, in __init__
    self.enc = encoder.get_encoder(self.model_name, models_dir)
  File "/content/AIDungeon/AIDungeon/generator/gpt2/src/encoder.py", line 111, in get_encoder
    with open(os.path.join(models_dir, model_name, 'vocab.bpe'), 'r', encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'generator/gpt2/models/model_v5/vocab.bpe'

@Autistic-Unicorn

Autistic-Unicorn commented Dec 8, 2019

```
12/08 01:51:06 [NOTICE] Downloading 1 item(s)

12/08 01:51:06 [ERROR] CUID#7 - Download aborted. URI=http://**/model_v5/model-550.data-00000-of-00001
Exception: [AbstractCommand.cc:351] errorCode=22 URI=http://**/model_v5/model-550.data-00000-of-00001
-> [HttpSkipResponseCommand.cc:239] errorCode=22 The response status is not successful. status=403

12/08 01:51:06 [NOTICE] Download GID#8f01c46017c97048 not complete:

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
8f01c4|ERR |       0B/s|http://**/model_v5/model-550.data-00000-of-00001

Status Legend:
(ERR):error occurred.

aria2 will resume download if the transfer is restarted.
If there are any errors, then see the log file. See '-l' option in help/man page for details.
--2019-12-08 01:51:06-- http://**/model_v5/checkpoint
Connecting to 130.211.31.182:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-08 01:51:06 ERROR 403: Forbidden.

--2019-12-08 01:51:06-- http://**/model_v5/encoder.json
Connecting to 130.211.31.182:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-08 01:51:06 ERROR 403: Forbidden.

--2019-12-08 01:51:06-- http://**/model_v5/hparams.json
Connecting to 130.211.31.182:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-08 01:51:07 ERROR 403: Forbidden.

--2019-12-08 01:51:07-- http://**/model_v5/model-550.index
Connecting to 130.211.31.182:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-08 01:51:07 ERROR 403: Forbidden.

--2019-12-08 01:51:07-- http://**/model_v5/model-550.meta
Connecting to 130.211.31.182:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14316502 (14M) [application/octet-stream]
Saving to: ‘model-550.meta’

model-550.meta 100%[===================>] 13.65M --.-KB/s in 0.07s

2019-12-08 01:51:07 (190 MB/s) - ‘model-550.meta’ saved [14316502/14316502]

--2019-12-08 01:51:07-- http://**/model_v5/vocab.bpe
Connecting to 130.211.31.182:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-08 01:51:07 ERROR 403: Forbidden.
```

There is a 403 error while downloading, and this causes the errors while initializing AI Dungeon.
I replaced the IP addresses with "**" since I don't know if it's safe to share them.
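
Because the 403s leave most of the model files missing, the game only fails later when it tries to open them. A minimal sketch of a pre-flight check is below. It assumes the file names seen in the download log and the `generator/gpt2/models/model_v5` path from the tracebacks. The helper `missing_model_files` is hypothetical and not part of the AIDungeon code.

```python
import os

# File names taken from the download log above; treat this list as an assumption.
REQUIRED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model-550.data-00000-of-00001",
    "model-550.index",
    "model-550.meta",
    "vocab.bpe",
]


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent or empty in model_dir."""
    missing = []
    for name in required:
        path = os.path.join(model_dir, name)
        # A failed download can leave no file at all, or a zero-byte file.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

Calling `missing_model_files("generator/gpt2/models/model_v5")` before starting `play.py` would list the files that failed to download, instead of surfacing only the first one as a `FileNotFoundError`.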

@ghost

ghost commented Dec 8, 2019

you beat me to it - I think they probably turned off downloading the necessary models. How big are they?

@ghost

ghost commented Dec 8, 2019

Same here

@ghost

ghost commented Dec 8, 2019

Looks like Nick updated the download URL - getting "Network is unreachable" now

Anyone able to set up a mirror?

@ghost

ghost commented Dec 8, 2019

@nickwalton how big are the files that need to be hosted?

@JillCrungus

A recent commit changed the download url in the install.sh file to an IP that returns 403 errors.

A commit that was just pushed changed the URL again to https://students.cs.byu.edu/~nickwalt but this site doesn't work either.

You can temporarily hotfix it on your end by changing the download_url back to the original, https://aidungeonmodel.s3-us-west-1.amazonaws.com until Nick can sort it out

@ghost

ghost commented Dec 8, 2019

Files don't seem that big, why not just host them on the git repo?

@nickwalton
Contributor

I think I just fixed this issue. Sorry for the craziness guys. It was just costing INSANE amounts. But should be reasonable now. Let me know if it still doesn't work.

@ghost

ghost commented Dec 8, 2019

Ahh, the apt-get install of aria2 failed because, when running locally, it needed sudo. The errors weren't visible when running the install script.

I see now that it's 5.8 GiB.

Is this downloaded every time someone runs this in that notebook, or is there caching? Sorry, I'm still downloading to my local machine right now and haven't retried the notebook.

@JeffreyBenjaminBrown
Author

@nickwalton -- Details, please! What was the cost per game before the fix? The cost now? How many people are playing it?

@JeffreyBenjaminBrown
Author

And is it your personal money, or the university's?

@ghost

ghost commented Dec 8, 2019

I'm going to make an issue for this problem

@ghost

ghost commented Dec 8, 2019

> I think I just fixed this issue. Sorry for the craziness guys. It was just costing INSANE amounts. But should be reasonable now. Let me know if it still doesn't work.

Cloning into 'AIDungeon'...
remote: Enumerating objects: 48, done.
remote: Counting objects: 100% (48/48), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 48 (delta 2), reused 22 (delta 0), pack-reused 0
Unpacking objects: 100% (48/48), done.
/content/AIDungeon
Downloading AIDungeon2 Model... (this may take a few minutes)

12/08 03:34:37 [NOTICE] Downloading 1 item(s)

12/08 03:34:37 [ERROR] CUID#13 - Download aborted. URI=http://130.211.31.182:80/model_v5/model-550.data-00000-of-00001
Exception: [AbstractCommand.cc:351] errorCode=22 URI=http://130.211.31.182:80/model_v5/model-550.data-00000-of-00001
-> [HttpSkipResponseCommand.cc:239] errorCode=22 The response status is not successful. status=403

Additionally, the workaround now fails with:
https://aidungeonmodel.s3-us-west-1.amazonaws.com/model_v5/hparms.json
returning:

<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>A1CC78CD07CD6853</RequestId>
<HostId>
8Coije0L8KTLMiWRGinwdYfzmmc6YzwAF/zrvtNxdtC1RlJg02MY2BVP5iRIWJqBGyMeB66jo+g=
</HostId>
</Error>

@anonymousUsers1

Running into the same issue, resulting in the following message:

```
AI Dungeon 2 will save and use your actions and game to continually improve AI
Dungeon. If you would like to disable this enter 'nosaving' for any action.
This will also turn off the ability to save games.

Initializing AI Dungeon! (This might take a few minutes)

Traceback (most recent call last):
  File "play.py", line 211, in <module>
    play_aidungeon_2()
  File "play.py", line 74, in play_aidungeon_2
    generator = GPT2Generator()
  File "/content/AIDungeon/AIDungeon/generator/gpt2/gpt2_generator.py", line 29, in __init__
    with open(os.path.join(models_dir, self.model_name, 'hparams.json')) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'generator/gpt2/models/model_v5/hparams.json'
```

@ghost

ghost commented Dec 8, 2019

> @nickwalton -- Details, please! What was the cost per game before the fix? The cost now? How many people are playing it?

13.34¢ USD per run with current S3 prices, which doesn't sound like a lot, but according to Google, searches for "AI Dungeon 2" are quickly coming close to eclipsing searches for "World of Warcraft" to give you an idea of how much traffic this is getting. It's been republished by a ton of large tech and gaming sites/blogs/news aggregators.

image

@anonymousUsers1

anonymousUsers1 commented Dec 8, 2019

> It's been republished by a ton of large tech and gaming sites/blogs/news aggregators.

The futurism.com article about this project appeared on my google chrome home screen a few hours ago, so that may lead to a larger influx soon.

nickwalton reopened this Dec 8, 2019
@nickwalton
Contributor

Turns out the fix wasn't good enough. To answer your questions, the cost was something like 20-30 cents per download. We got 60,000 unique users to the aidungeon.io site, and the charges ended up at 15k for just today. Dr. Wingate, the professor of my lab, was sponsoring it, but it's gotten past what he can afford from the lab's budget, so I had to shut off public bucket access until there is a solution. If anyone wants to set up a torrent system or something, then I'm happy to support it.
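
As a rough sanity check on those numbers, the sketch below multiplies the quoted user count by the quoted per-download cost range. It assumes exactly one download per unique user, which is a simplification (a single user restarting the notebook triggers additional downloads). The function name `estimated_bill` is only illustrative.

```python
def estimated_bill(unique_users, cost_per_download_usd):
    """Rough bill in USD, assuming each unique user triggers exactly one download."""
    return unique_users * cost_per_download_usd


users = 60_000  # unique users quoted above
low = estimated_bill(users, 0.20)   # lower end of the quoted 20-30 cent range
high = estimated_bill(users, 0.30)  # upper end of the quoted range
print(f"Estimated bill: ${low:,.0f} to ${high:,.0f}")
```

The resulting range brackets the reported 15k, so the quoted per-download cost and user count are consistent with the reported charges.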

@snglth

snglth commented Dec 8, 2019

IPFS may be a way to handle the model distribution

@JeffreyBenjaminBrown
Author

Omg. You guys are amazing.

What's a run? If someone selects "restart and run all", is that a new run? I may have done that 40 times today.

@ghost

ghost commented Dec 8, 2019

i'd gladly help seed any torrents!

@snglth

snglth commented Dec 8, 2019

Torrents should be fine too. You can feed a magnet link to aria2c to download files in Colab. Someone who already has the model files just needs to register a torrent file on some tracker.
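
A minimal sketch of what that could look like from a notebook cell follows. The magnet link is a placeholder, since no real info-hash is given at this point in the thread, and the helper name `build_aria2c_command` is hypothetical. `--dir` and `--seed-time` are standard aria2c options.

```python
import shlex


def build_aria2c_command(magnet_uri, dest_dir, seed_time_minutes=0):
    """Build an aria2c argument list for fetching a torrent via a magnet link."""
    return [
        "aria2c",
        f"--dir={dest_dir}",
        # With seed time 0, aria2c stops seeding once the download finishes.
        f"--seed-time={seed_time_minutes}",
        magnet_uri,
    ]


# Placeholder magnet link: the real info-hash is not given in this thread.
MAGNET = "magnet:?xt=urn:btih:<INFO_HASH>"
cmd = build_aria2c_command(MAGNET, "generator/gpt2/models")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Setting a non-zero seed time would keep the Colab instance uploading to other peers for a while after it finishes, which helps the swarm at the cost of a longer setup step.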

@ghost

ghost commented Dec 8, 2019

> i'd gladly help seed any torrents!

Same, though I suspect Nick probably doesn't want to spend the time re-implementing this download over a bittorrent client inside of a shell script.

If it's any consideration, I know OVH does storage at 1.1¢ per GB, which works out to only about 6.38¢/run and would decrease your costs significantly. But at the levels of bandwidth we're talking about, I think the most cost-effective option would be to host the game as its own service on a dedicated server with plenty of processing power.
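
The per-run figures in this thread appear to come from multiplying the model size by a per-GB price. The sketch below reproduces them under two assumptions: the 5.8 GiB size reported earlier is treated loosely as 5.8 GB, and the 2.3¢/GB rate is inferred by dividing 13.34¢ by 5.8 rather than taken from a price list.

```python
MODEL_SIZE_GB = 5.8  # model size reported earlier in the thread (GiB treated as GB)


def cost_per_run_cents(price_cents_per_gb, size_gb=MODEL_SIZE_GB):
    """Cost in cents of serving one full model download."""
    return price_cents_per_gb * size_gb


s3 = cost_per_run_cents(2.3)   # assumed rate that reproduces the 13.34 cent figure
ovh = cost_per_run_cents(1.1)  # OVH rate quoted above
print(f"S3: {s3:.2f} cents/run, OVH: {ovh:.2f} cents/run")
```

Rounded to two decimals, this gives 13.34¢ and 6.38¢ per run, matching the numbers quoted above.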

@ghost

ghost commented Dec 8, 2019

@InfosecRD see above, aria2c can do magnet links

@ghost

ghost commented Dec 8, 2019

> @InfosecRD see above, aria2c can do magnet links

I'm no expert on the colab platform, but if this is implemented, I'll help seed from my home connection and on my VPS :)

@nickwalton
Contributor

nickwalton commented Dec 8, 2019 via email

@Akababa
Contributor

Akababa commented Dec 8, 2019

I've got the folder shared on my google drive too, this way you can mount the file directly to avoid copying: https://drive.google.com/drive/folders/1XiDD2BD8vLZaJxZpCrNYjscpvnD3EYrP

@stylemistake
Contributor

stylemistake commented Dec 8, 2019

> mount the file directly to avoid copying

Brilliant idea, Nick might in fact use his own GDrive to have more control over his models.

@Portaluke

> > mount the file directly to avoid copying
>
> Brilliant idea, Nick might in fact use his own GDrive to have more control over his models.

Out of curiosity, would this make the torrent obsolete as far as the progress we made on it?

@nickwalton
Contributor

nickwalton commented Dec 8, 2019 via email

@Akababa
Contributor

Akababa commented Dec 8, 2019

They'd have to click the link manually first to get it as a shared folder under their drive, and then they mount their own drive.
https://stackoverflow.com/questions/53576555/share-a-part-of-google-drive-on-colab

@JushBJJ
Contributor

JushBJJ commented Dec 8, 2019

Cloning into 'AIDungeon'...
remote: Enumerating objects: 48, done.
remote: Counting objects: 100% (48/48), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 48 (delta 2), reused 20 (delta 0), pack-reused 0
Unpacking objects: 100% (48/48), done.
/content/AIDungeon/AIDungeon
Downloading AIDungeon2 Model... (this may take a few minutes)
--2019-12-08 06:48:35--  https://github.com/nickwalton/AIDungeon/files/3935881/model_v5.torrent.zip
Resolving github.com (github.com)... 140.82.114.4
Connecting to github.com (github.com)|140.82.114.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-repository-file-5c1aeb.s3.amazonaws.com/179196443/3935881?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20191208%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20191208T064836Z&X-Amz-Expires=300&X-Amz-Signature=66861f1a4a2ae662d4bfd35f90081cf799b4a456004efe3bebbefd32c2135cb9&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3Bfilename%3Dmodel_v5.torrent.zip&response-content-type=application%2Fzip [following]
--2019-12-08 06:48:36--  https://github-production-repository-file-5c1aeb.s3.amazonaws.com/179196443/3935881?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20191208%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20191208T064836Z&X-Amz-Expires=300&X-Amz-Signature=66861f1a4a2ae662d4bfd35f90081cf799b4a456004efe3bebbefd32c2135cb9&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3Bfilename%3Dmodel_v5.torrent.zip&response-content-type=application%2Fzip
Resolving github-production-repository-file-5c1aeb.s3.amazonaws.com (github-production-repository-file-5c1aeb.s3.amazonaws.com)... 52.217.37.204
Connecting to github-production-repository-file-5c1aeb.s3.amazonaws.com (github-production-repository-file-5c1aeb.s3.amazonaws.com)|52.217.37.204|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 60573 (59K) [application/zip]
Saving to: ‘model_v5.torrent.zip’

model_v5.torrent.zi 100%[===================>]  59.15K   162KB/s    in 0.4s    

2019-12-08 06:48:37 (162 KB/s) - ‘model_v5.torrent.zip’ saved [60573/60573]

Archive:  model_v5.torrent.zip
  inflating: model_v5.torrent        

12/08 06:48:37 [NOTICE] Downloading 1 item(s)

12/08 06:48:37 [ERROR] Exception caught while loading DHT routing table from /root/.cache/aria2/dht.dat
Exception: [DHTRoutingTableDeserializer.cc:83] errorCode=1 Failed to load DHT routing table from /root/.cache/aria2/dht.dat

12/08 06:48:37 [NOTICE] IPv4 DHT: listening on UDP port 6926

12/08 06:48:37 [NOTICE] IPv4 BitTorrent: listening on TCP port 6900

12/08 06:48:37 [ERROR] IPv6 BitTorrent: failed to bind TCP port 6900
Exception: [SocketCore.cc:312] errorCode=1 Failed to bind a socket, cause: Name or service not known

I did this in the colab website.

@stylemistake
Contributor

This error is misleading, because it simply fails to bind to IPv6 port, but it was downloading fine over IPv4.

@ghost

ghost commented Dec 8, 2019

I think the Google drive idea might be better in practice

@Akababa
Contributor

Akababa commented Dec 8, 2019

I'm working on the Google drive symlink workaround, should be ready soon.

@WAUthethird

@Akababa
Contributor

Akababa commented Dec 8, 2019

Google Drive workaround: https://colab.research.google.com/drive/1OjBQe4H4C2s-p4-OeJoXw5DStIjPy2VS

Why does it copy the files? Doesn't the shared folder work fine?

@WAUthethird

> Google Drive workaround: https://colab.research.google.com/drive/1OjBQe4H4C2s-p4-OeJoXw5DStIjPy2VS
>
> Why does it copy the files? Doesn't the shared folder work fine?

No, because this method requires mounting the user's Google Drive.

@Akababa
Contributor

Akababa commented Dec 8, 2019

@WAUthethird So I don't quite understand. Why would you download from your google drive instead of mounting your google drive?

@WAUthethird

> @WAUthethird So I don't quite understand. Why would you download from your google drive instead of mounting your google drive?

I'm not quite sure what you mean. It copies the files after the drive has been mounted.

@Akababa
Contributor

Akababa commented Dec 8, 2019

@WAUthethird Shared folders don't take up space, and the colab instance uses a symbolic link not a copy. Just checked my google drive usage on my test account and it hasn't changed.
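
A minimal sketch of the symlink approach described here, assuming the shared folder has already been added to the user's own Drive. The helper name `link_model_dir` and the Drive path in the comments are assumptions; the mounted path depends on the Colab version and on what the folder is called in your Drive.

```python
import os


def link_model_dir(shared_dir, target_link):
    """Symlink target_link -> shared_dir so that no model files are copied."""
    if not os.path.isdir(shared_dir):
        raise FileNotFoundError(f"Shared folder not found: {shared_dir}")
    os.makedirs(os.path.dirname(target_link) or ".", exist_ok=True)
    if os.path.islink(target_link):
        os.remove(target_link)  # replace a stale link from an earlier session
    os.symlink(shared_dir, target_link, target_is_directory=True)
    return target_link


# In Colab, mount Drive first (this only works inside a Colab runtime):
#   from google.colab import drive
#   drive.mount("/content/drive")
# Then, assuming the shared folder appears in your Drive as "model_v5":
#   link_model_dir("/content/drive/My Drive/model_v5",
#                  "generator/gpt2/models/model_v5")
```

If `target_link` already exists as a real directory (for example from a partial download), `os.symlink` raises `FileExistsError`, so that directory would need to be removed first.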

@WAUthethird

WAUthethird commented Dec 8, 2019

> @WAUthethird Shared folders don't take up space, and the colab instance uses a symbolic link not a copy. Just checked my google drive usage on my test account and it hasn't changed.

Ah, I see. I think I was recalling the instances where I got "Quota Exceeded" on a few files and had to make a copy, which did take up space. I've updated the instructions.

@ghost

ghost commented Dec 8, 2019

I'm seeing an input/output error attempting to run with the workaround link that has the uncensored version. I think I added it to my drive correctly?

tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: generator/gpt2/models/model_v5/model-550.data-00000-of-00001; Input/output error
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[save/RestoreV2/_301]]
  (1) Unknown: generator/gpt2/models/model_v5/model-550.data-00000-of-00001; Input/output error
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

@ghost

ghost commented Dec 8, 2019

I'll continue to work on getting shards up on GitHub. It will probably take a few hours to upload. I think the script that does it is just about ready.

@ghost

ghost commented Dec 8, 2019

anyone able to get the google drive shared folder method working?

Also it looks like the notebook reset itself. Things are only cached per session. That's a lot of re-downloading for anyone who wants to use this more than once.

Also it looks like all of my github hosted shards are up. I'll write a script to pull them down and piece them together.
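
The "piece them together" step could be sketched as a plain byte-wise concatenation. This assumes the shards are simple consecutive slices of the original file whose names sort in the right order. The actual shard naming and layout are not given in this thread, so `join_shards` and the file pattern in the usage note are hypothetical.

```python
import shutil


def join_shards(shard_paths, output_path, chunk_size=1024 * 1024):
    """Concatenate shard files, in the given order, into output_path."""
    with open(output_path, "wb") as out:
        for shard in shard_paths:
            with open(shard, "rb") as part:
                # Stream in chunks so multi-GB shards never sit fully in memory.
                shutil.copyfileobj(part, out, chunk_size)
    return output_path
```

For example, `join_shards(sorted(glob.glob("model-550.data.part*")), "model-550.data-00000-of-00001")` would rebuild the checkpoint data file from zero-padded, numbered parts.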

@nickwalton
Contributor

nickwalton commented Dec 8, 2019 via email

@ghost

ghost commented Dec 8, 2019

😬 given the demand I can't imagine an app version would be cheap (assuming you mean web-app), especially since this seems to require a high end GPU. With colab google is basically giving away their product at a loss to get people to use it. Plus also isn't there a state you'd need to keep track of for every session? There's a lot of problems to solve.

@stylemistake
Contributor

stylemistake commented Dec 8, 2019

Torrents work amazingly well now, with downloads averaging to 2MiB/s per client.

@nickwalton perhaps you should update the Colab page with information about the torrent, encouraging users to get it and seed it, and bring back the setup instructions. Unless, of course, you are already fully immersed in making an official web app, in which case I wish you the best of luck 👍

Web development is a time-consuming job.

ben-bay added the bug label Dec 9, 2019
@Valareos

If it's a matter of having a server to store and serve files on, may I suggest using Hetzner? They offer fully self-managed root-access servers. I have had no issues with them on maintenance, and the one time the server was unstable due to an unknown hardware issue, they just swapped the hard drives into new hardware.

You can bid on one of their used systems at https://www.hetzner.com/sb. At the moment they have an Intel i7-2600 server with 2x3 TB RAID 1 capacity and unlimited 1 Gbit/s traffic for 32.27 euro a month. They have different Linux distros to choose from, or Windows if you want to pay a bit more for the license.

This way you get full control over the files themselves.

@Valareos

I run a cryptocurrency wallet and pool off of mine. If it can handle that, it can handle your needs :)

@stylemistake
Contributor

stylemistake commented Dec 10, 2019

> 1Gb/s bandwidth

Trust me, no standalone server would be capable of dealing with this demand. We currently have about 1000 torrent leechers, and just calculate how much of that bandwidth would be left to each client. 1000 Mbit/s -> 125 MB/s (best case) -> 125kB/s/client.
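
The arithmetic above can be written out as a short sketch. It uses decimal units (1 Mbit/s = 125 kB/s) and assumes the link is split evenly with no protocol overhead, which is the best case. The 5.8 GB model size is the figure reported earlier in the thread, and the function names are illustrative.

```python
def per_client_kb_per_s(link_mbit_per_s, clients):
    """Best-case share of a link per client, in kB/s (decimal units)."""
    link_kb_per_s = link_mbit_per_s * 1000 / 8  # Mbit/s -> kB/s
    return link_kb_per_s / clients


def download_hours(size_gb, rate_kb_per_s):
    """Hours needed to fetch size_gb at a constant rate_kb_per_s."""
    return size_gb * 1_000_000 / rate_kb_per_s / 3600


rate = per_client_kb_per_s(1000, 1000)  # 1 Gbit/s link, 1000 leechers
print(f"{rate:.0f} kB/s per client")
print(f"{download_hours(5.8, rate):.1f} hours for a 5.8 GB model")
```

At 125 kB/s per client, a single model download would take roughly half a day, which is why a lone 1 Gbit/s server does not keep up with this many simultaneous downloaders.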

@jonahsnider

Google Cloud Platform has a generous $300 free credit to spend over the course of one year. You could rent a VPS temporarily to use as a CDN. This has the benefit of potentially being hosted in the same datacenter as Google Colaboratory, which could enable very fast local transfers over HTTP.

@Valareos

Valareos commented Dec 10, 2019

> > 1Gb/s bandwidth
>
> Trust me, no standalone server would be capable of dealing with this demand. We currently have about 1000 torrent leechers, and just calculate how much of that bandwidth would be left to each client. 1000 Mbit/s -> 125 MB/s (best case) -> 125kB/s/client.

They offer cloud service too. Their highest standard one is 35.58 euro a month and has 8 vCPUs, 32 GB RAM, 240 GB storage, and 20 TB/month of traffic. It would be cheaper to scale that up as needed than to pay a per-instance cost. 15k in one day would have paid for over 450 servers serving a total of 9 PB of data for a month. I don't think you need that much.

The cheapest option is 2.99 euro a month with 1 vCPU, 2 GB RAM, 20 GB storage, and 20 TB/month, so you can pick in that range for what your file hosting section needs.

LaTueur pushed a commit to LaTueur/AIDungeon that referenced this issue Dec 20, 2019
Update help text to match actual 'reset', 'restart' behavior