diff --git a/README.md b/README.md
index 2f9e81d8..457e737a 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
+
# Unstract
@@ -18,13 +18,13 @@ They also contain helper methods/classes to aid with other tasks such as indexin
- The below libraries need to be installed to run the SDK
- Linux
- ```
+ ```bash
sudo apt install build-essential pkg-config libmagic-dev
```
- Mac
- ```
+ ```bash
brew install pkg-config libmagic pandoc tesseract
```
@@ -60,6 +60,7 @@ Index Version **0.9.28** as on January 14th, 2024
### Developing with the SDK
Ensure that you have all the required dependencies and pre-commit hooks installed
+
```shell
uv sync
pre-commit install
@@ -68,6 +69,7 @@ pre-commit install
Once the changes have been made, they can be tested with [Unstract](https://github.com/Zipstack/unstract) through the following means.
#### With UV
+
Specify the SDK as a dependency to a project using a tool like `uv` by adding the following to your `pyproject.toml`
```toml
@@ -85,19 +87,25 @@ unstract-sdk = { path = "${UNSTRACT_SDK_PATH}", editable = true }
```
#### With pip
+
- If the project is using `pip` it might be possible to add it as a dependency in `requirements.txt`
-```
+
+```shell
-e /path/to/unstract-sdk
```
+
NOTE: Building locally might require the below section to be replaced in the `unstract-sdk`'s build system configuration
-```
+
+```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
```
+
- Another option is to provide a git URL in `requirements.txt`; this can come in handy while building tool
Docker images. Don't forget to run `apt install git` within the `Dockerfile` for this.
-```shell
+
+```toml
[tool.uv.sources]
unstract-sdk = { git = "git+https://github.com/Zipstack/unstract-sdk@feature-branch" }
```
@@ -105,15 +113,18 @@ unstract-sdk = { git = "git+https://github.com/Zipstack/unstract-sdk@feature-bra
- Or try installing a [local PyPI server](https://pypi.org/project/pypiserver/) and upload / download your package from this server
#### Additional dependencies for tools
-Tools may need to be backed up by a file storage. unstract.sdk.file_storage contains the required interfaces for the
-same. fssepc is being used underneath to implement these interfaces. Hence, one can choose to use a file_system
+
+Tools may need to be backed by file storage. `unstract.sdk.file_storage` contains the required interfaces for
+this. `fsspec` is used underneath to implement these interfaces, so one can choose to use any file system
supported by fsspec for this. However, the required dependencies need to be added in the tool dependency manager.
-Eg. If the tool is using Minio as the underlying file storage, then s3fs can be added to support it.
-Similarly, for Google Cloud Storage, gcsfs is to be added.
+E.g., if the tool uses Minio as the underlying file storage, then `s3fs` can be added to support it.
+Similarly, for Google Cloud Storage, `gcsfs` needs to be added.
The following versions are tested in the SDK using unit test cases for the above package.
- gcsfs==2024.10.0
- s3fs==2024.10.0
+```text
+ gcsfs==2024.10.0
+ s3fs==2024.10.0
+```
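
As a rough illustration of the version note above, the sketch below checks whether the `fsspec` backends installed in a tool's environment match the versions listed as tested. It uses only the standard library; the names `TESTED_VERSIONS` and `check_backends` are hypothetical and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above as tested with the SDK.
TESTED_VERSIONS = {"gcsfs": "2024.10.0", "s3fs": "2024.10.0"}


def check_backends(tested=TESTED_VERSIONS):
    """Map each package to 'ok', 'missing', or 'mismatch (<installed version>)'."""
    report = {}
    for package, expected in tested.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            # The backend is not installed in this environment.
            report[package] = "missing"
            continue
        report[package] = "ok" if installed == expected else f"mismatch ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_backends().items():
        print(f"{package}: {status}")
```

A tool that only uses Minio would likely only care about the `s3fs` entry; a mismatch does not necessarily mean the backend will fail, only that the combination is not the one covered by the SDK's unit tests.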
### Documentation generation
diff --git a/docs/assets/unstract_u_logo.png b/docs/assets/unstract_u_logo.png
index de7ed623..db21db58 100644
Binary files a/docs/assets/unstract_u_logo.png and b/docs/assets/unstract_u_logo.png differ