Documentation: add docker troubleshooting when unable to connect to solr #8951

Merged
38 changes: 36 additions & 2 deletions docker/README.md
@@ -108,21 +108,55 @@ Note: please update this README with the exact wording of the error if you run i

`OSError: [Errno 12] Cannot allocate memory:` could occur in conjunction with `/openlibrary/openlibrary/core` or any number of files. Simply try increasing free RAM or increasing swap/page file/virtual memory for your operating system.


#### "No module named 'infogami'"

The following should populate the target of the `infogami` symbolic link (i.e. `vendor/infogami/`):
```
cd path/to/your/cloned/openlibrary
git submodule init; git submodule sync; git submodule update
```

Windows users may need to see [Fix line endings, symlinks, and git submodules](https://github.com/internetarchive/openlibrary/wiki/Git-Cheat-Sheet#fix-line-endings-symlinks-and-git-submodules-only-for-windows-users-not-using-a-linux-vm).

#### "no configuration file provided: not found" when running `docker compose <command>`

Ensure you're running `docker compose` commands from within the `local-openlibrary-dev-directory`.
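
As a quick sanity check before running `docker compose`, the sketch below tests whether a directory contains a Compose file. The helper name `has_compose_file` is hypothetical, and the list of file names reflects the defaults Compose documents (`compose.yaml` preferred, older `docker-compose.*` names still accepted); treat both as assumptions rather than anything specific to this repository.

```sh
# Hypothetical helper (not part of Open Library): succeeds if directory $1
# (default: the current directory) contains a file name that Compose
# looks for by default.
has_compose_file() {
  dir=${1:-.}
  for name in compose.yaml compose.yml docker-compose.yaml docker-compose.yml; do
    [ -f "$dir/$name" ] && return 0
  done
  return 1
}
```

For example, `has_compose_file || echo "no Compose file here; cd into your openlibrary clone first"`.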

#### "ConnectionError: HTTPConnectionPool(host='solr', port=8983)"
The full error is something like (line breaks added):
```
/openlibrary/openlibrary/templates/home/index.html: error in processing
template: ConnectionError: HTTPConnectionPool(host='solr', port=8983):
Max retries exceeded with url: /solr/openlibrary/select (Caused by
NameResolutionError("<urllib3.connection.HTTPConnection object at
0x77a95c4e7f90>: Failed to resolve 'solr' ([Errno -2] Name or service
not known)")) (falling back to default template)
```
The following should get everything running again:
```sh
docker compose down
docker container ls -a
# If you see any openlibrary containers here, remove them with `docker rm -f NAME`
docker network ls
# If you see any openlibrary networks here, remove them with `docker network rm NAME`
docker compose up # or docker compose up -d
```
If you're curious about what happened and why the above likely fixes it, first verify that the `solr` container is running: run `docker ps | grep solr` and look for something like `openlibrary-solr-1` (not `solr-updater`). If the `solr` container isn't running, start it with `docker compose up solr` (or `docker compose up -d solr`); that should fix it. If `solr` is running, also verify that you can connect to Solr at http://localhost:8983/solr/#/. If you can't, something else is likely wrong.
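
The `grep` check for the `solr` container can be made a little stricter so that `solr-updater` is filtered out automatically. This is only a sketch: `solr_containers` is a hypothetical helper name, and it assumes container names follow the `openlibrary-solr-1` pattern mentioned here.

```sh
# Hypothetical helper (not part of Open Library): reads container names on
# stdin and prints those that mention "solr" but are not the solr-updater.
solr_containers() {
  grep -- 'solr' | grep -v -- 'solr-updater'
}
```

Usage: `docker ps --format '{{.Names}}' | solr_containers`. Empty output suggests the `solr` container is not running.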

If the `solr` container is running and the error persists, one cause seems to be that containers sometimes become disconnected from `openlibrary_webnet` (though this could happen with `openlibrary_dbnet` too). `openlibrary-web-1`/`web` should be connected to both `openlibrary_webnet` and `openlibrary_dbnet`, but when this problem occurs, only one is connected. E.g.:
```sh
docker container inspect --format '{{.NetworkSettings.Networks}}' openlibrary-web-1
# output: map[openlibrary_dbnet:0xc00037c1c0]
```
Because you've read this far, you can now directly fix the problem without removing the containers and networks. Simply reconnect the container to the network:
```sh
docker network connect openlibrary_webnet openlibrary-web-1 # or `openlibrary_dbnet` as the case may be.
docker container inspect --format '{{.NetworkSettings.Networks}}' openlibrary-web-1
# output: map[openlibrary_dbnet:0xc00016c460 openlibrary_webnet:0xc00016c540]
```
No restart is required. If `webnet` no longer exists, recreating it _should_ fix things: `docker network create openlibrary_webnet`.
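
The inspect-and-reconnect steps can also be scripted. The following is a minimal sketch, not something the project ships: `missing_networks` and `reconnect_missing` are hypothetical helper names, and the Go template simply prints the attached network names separated by spaces.

```sh
# Hypothetical helper (not part of Open Library): print each expected network
# that is absent from a space-separated list of attached networks.
# $1: attached networks; remaining arguments: expected networks.
missing_networks() {
  attached=" $1 "
  shift
  for net in "$@"; do
    case "$attached" in
      *" $net "*) ;;                 # already attached
      *) printf '%s\n' "$net" ;;     # missing
    esac
  done
}

# Hypothetical wrapper: reconnect container $1 to any expected network
# (remaining arguments) that it is not currently attached to.
reconnect_missing() {
  container=$1
  shift
  attached=$(docker container inspect \
    --format '{{range $k, $v := .NetworkSettings.Networks}}{{$k}} {{end}}' \
    "$container") || return 1
  for net in $(missing_networks "$attached" "$@"); do
    docker network connect "$net" "$container"
  done
}
```

For example: `reconnect_missing openlibrary-web-1 openlibrary_webnet openlibrary_dbnet`.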

To understand a bit more about what's going on here: `compose.yaml` configures several Docker networks. The containers should be able to resolve one another by service name (e.g. `web` and `solr`), assuming `compose.yaml` has them on the same network. For more, see [Networking in Compose](https://docs.docker.com/compose/networking/).

## Teardown commands

```sh