
Releases: jina-ai/clip-as-service

Release v1.8.1

02 Feb 02:38

Highlights

  • support fp16
  • add built-in HTTP server
  • add concurrent BertClient
  • fix python2 compatibility
  • support showing the tokenization result to the client
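For a rough sense of what fp16 buys, half-precision halves the storage and transfer cost of each embedding. A back-of-the-envelope sketch (768 is BERT-base's hidden size; the function below is illustrative, not part of the API):

```python
def embedding_bytes(num_sentences, dim=768, bytes_per_float=4):
    """Raw storage for num_sentences embeddings at a given float width."""
    return num_sentences * dim * bytes_per_float

fp32 = embedding_bytes(1_000_000)                      # float32: 4 bytes/value
fp16 = embedding_bytes(1_000_000, bytes_per_float=2)   # float16: 2 bytes/value
print(fp32 // 2**20, fp16 // 2**20)  # 2929 1464 (MiB)
```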

Improvements

  • fix benchmark
  • fix RTD generation

Release v1.7.0

14 Jan 04:09

Highlights

  • Now supports Windows! Fixed the logger serialization issue
  • Add dashboard for monitoring the service in real-time

Improvements

  • Fix timeout and zmq.linger problem on the client side
  • Add a new server option -mask_cls_sep for masking [CLS] and [SEP] before pooling
  • Fix communication logic to avoid KeyError on the server
  • Fix README and documentation
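The effect of -mask_cls_sep can be sketched in plain Python: drop the [CLS] and [SEP] positions before averaging, so the pooled vector reflects only content tokens. Toy one-dimensional vectors below, not real BERT output:

```python
def mean_without_special(tokens, vecs, special=("[CLS]", "[SEP]")):
    """Mean-pool token vectors, excluding special-token positions."""
    keep = [v for t, v in zip(tokens, vecs) if t not in special]
    dim = len(keep[0])
    return [sum(v[d] for v in keep) / len(keep) for d in range(dim)]

tokens = ["[CLS]", "hello", "world", "[SEP]"]
vecs = [[9.0], [1.0], [3.0], [9.0]]
print(mean_without_special(tokens, vecs))  # [2.0], vs 5.5 with [CLS]/[SEP] included
```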

Release v1.6.0

18 Dec 12:20

Highlights

  • Optimize for concurrency of multiple clients. Using multiple sockets between the ventilator and the workers prevents a very large job from a single client from queuing up a worker and leaving other clients hanging forever.
  • With the new -device_map, you can specify which GPUs to run on, and even mix GPU/CPU workloads
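The head-of-line problem the multi-socket change addresses can be illustrated with a toy scheduler (a sketch of the idea only, not the actual zmq code): with one shared FIFO, client B's single job would wait behind all 100 of client A's; draining per-client queues round-robin serves it almost immediately.

```python
from collections import deque

def drain_round_robin(queues):
    """Serve jobs by cycling over per-client queues instead of one shared FIFO."""
    order = []
    while any(queues.values()):
        for client, q in queues.items():
            if q:
                order.append((client, q.popleft()))
    return order

queues = {"A": deque(range(100)), "B": deque(["job"])}
order = drain_round_robin(queues)
print(order.index(("B", "job")))  # 1 — served second, not 101st
```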

Improvements

  • Add a timeout to BertClient, which raises a TimeoutError when the server is not online
  • Refactor device_map on the server side, fixing bugs
  • Use zmq decorator to improve shutdown behavior
  • Add cosine distance explanation to FAQ
  • Update README for better readability
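The timeout behavior above can be sketched with a stdlib queue standing in for the reply socket (illustrative only; the real client uses zmq polling):

```python
import queue

def recv_with_timeout(replies, timeout):
    """Wait for a reply; raise TimeoutError instead of blocking forever."""
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError("no response from server within %.1fs" % timeout)

offline = queue.Queue()   # never filled: stands in for an offline server
try:
    recv_with_timeout(offline, 0.1)
except TimeoutError as err:
    print("caught:", err)
```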

Release v1.5.5

14 Dec 13:20

Highlights

  • The BERT graph is first frozen, optimized, and stored as a self-contained file; workers then load from it (similar to model export in tf-serving)
  • Customized tokenization is now supported
  • refactor BertWorker to make it more multiprocess-friendly
  • fix the python2 pip install bug and an encoding bug in BertClient

Improvements

  • gather all zmq temp files in one place
  • improve the type check on the client side
  • add flag for cpu/gpu support
  • add flag for XLA support (no measurable improvement observed so far)

Release v1.5

10 Dec 06:38
5e80535

bert-as-service is now available on PyPI 🎆! From now on you can simply update the package by

pip install -U bert-serving-server bert-serving-client

No need to copy-paste the client code again!

Highlights

  • fix async scheduling on the server-side #105 #101
  • fix masking in REDUCE_MEAN, REDUCE_MAX, and REDUCE_MEAN_MAX #93
  • add flag to support switching between GPU/CPU #111 #108
  • fix concurrent client building issue #110 #60
  • fix server hangs due to slow-joiner #110

Improvements

  • add visualization examples
  • improve the README with more screenshots and examples
  • fix path error in examples
  • add version check for client and server
  • refactor client handshake logic
  • update docker commands

Release v1.4

05 Dec 14:06
e9d675b

Highlights

  • Add masking in REDUCE_MEAN, REDUCE_MAX, and REDUCE_MEAN_MAX. This matters most when max_seq_len is much larger than the actual sequence length sent by clients
  • Reduce latency significantly and improve speed by 20%; check out the new benchmark
  • Refactor async encoding; clients need to upgrade to use the new version
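Why the masking fix matters when max_seq_len far exceeds the real sequence length, in a toy example (2 real tokens of dimension 2, padded to length 8; plain Python, not the actual graph code): the naive mean is diluted by zero padding, the masked mean is not.

```python
# 2 real tokens (dim 2) padded out to max_seq_len = 8
real = [[1.0, 2.0], [3.0, 4.0]]
tokens = real + [[0.0, 0.0]] * 6
mask = [1, 1] + [0] * 6

def naive_mean(vecs):
    """Average over every position, padding included."""
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(len(vecs[0]))]

def masked_mean(vecs, mask):
    """Average only over real-token positions."""
    n = sum(mask)
    return [sum(v[d] * m for v, m in zip(vecs, mask)) / n for d in range(len(vecs[0]))]

print(naive_mean(tokens))         # [0.5, 0.75]  diluted by zero padding
print(masked_mean(tokens, mask))  # [2.0, 3.0]
```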

Improvements

  • Reformat server logging
  • Update benchmark table and plot
  • Update readme

Release v1.3

03 Dec 03:58

Highlights

  • Add classification examples
  • Restrict GPU memory usage

Improvements

  • More comprehensive README
  • Add more examples
  • Add benchmark for pooling_layers

Release v1.2

26 Nov 12:44
de69080

Highlights

  • fix a random C-level assert error caused by multi-threading in BertServer
  • redesign the message flow in BertServer and remove all back-chatter
  • the server now opens two ports: one for pushing text, the other for publishing encoded vectors

Improvements

  • fix and add more examples
  • fix the figure in README.md
  • use JSON serialization everywhere

Release v1.1

20 Nov 13:57
0689f99

Highlights

  • Support output of word embedding (by setting pooling_strategy=NONE)
  • Support different pooling strategies
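The pooling strategies can be sketched over toy 2-dimensional token vectors (illustrative pure Python; the real pooling runs inside the TF graph):

```python
def pool(vecs, strategy):
    """Toy pooling over a list of token vectors (list of lists)."""
    if strategy == "NONE":            # word embeddings: one vector per token
        return vecs
    cols = list(zip(*vecs))           # transpose to per-dimension tuples
    if strategy == "REDUCE_MEAN":
        return [sum(c) / len(c) for c in cols]
    if strategy == "REDUCE_MAX":
        return [max(c) for c in cols]
    raise ValueError(strategy)

vecs = [[1.0, 4.0], [3.0, 2.0]]
print(pool(vecs, "NONE"))         # [[1.0, 4.0], [3.0, 2.0]]
print(pool(vecs, "REDUCE_MEAN"))  # [2.0, 3.0]
print(pool(vecs, "REDUCE_MAX"))   # [3.0, 4.0]
```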

Improvements

  • More comprehensive README
  • fix logger
  • fix a wrong-ordering problem when client_batch_size is large
  • fix conflicting IPC addresses when starting multiple server instances

Release v1.0

16 Nov 08:02
fce57a3

Highlights

  • Refactor the server-side pipeline and job scheduling, improving scalability and reducing latency significantly.
  • Optimize the serialization of numpy array between sockets
  • Add more exhaustive benchmark results.

Improvements

  • Client now shows the server configuration on first connect.
  • Better logging per worker per module
  • Fix typos and enrich the README content
  • Add Dockerfile contributed in #12