---
layout: global
title: Release Notes
group: Overview
priority: 7
---

June 23, 2021

This is the first release on the Alluxio 2.6.X line.

Due to the improvements in the 2.5 release, a much larger number of users were able to leverage Alluxio’s ability to accelerate AI/ML workloads as a Data Orchestration framework. We have taken the feedback and contributions from the community and further improved the end-to-end AI/ML workflows with Alluxio. This has taken the form of simplified deployment, data management capabilities, performance optimizations, and enhanced visibility into system states.

The improvements in Alluxio 2.6 further enable the recommended architecture of running AI/ML workloads with Alluxio on any storage, and also improve the general maintainability and visibility of Alluxio clusters, especially for large-scale deployments.

  • Table of Contents {:toc}

Highlights

Unified Alluxio Worker and FUSE Process

Alluxio 2.6 supports running FUSE as a part of a worker process ([documentation]({{ '/en/api/POSIX-API.html#fuse-on-worker-process' | relativize_url }})). Coupling the two helps reduce interprocess communication, which is especially evident in AI/ML workloads where file sizes are small and RPC overheads make up a significant portion of the I/O time. In addition, containing both components in a single process greatly improves the deployability of the software in containerized environments, such as Kubernetes.
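As a rough illustration of why fixed per-request costs dominate for small files, the sketch below models read time as a constant overhead plus transfer time. The 1 ms overhead and 1 GB/s bandwidth are illustrative assumptions, not Alluxio measurements, and `overhead_fraction` is a hypothetical helper.

```python
def overhead_fraction(file_size_bytes, overhead_s, bandwidth_bytes_per_s):
    """Return the share of total read time spent on fixed per-request overhead."""
    transfer_s = file_size_bytes / bandwidth_bytes_per_s
    return overhead_s / (overhead_s + transfer_s)


# Assumed figures for illustration only.
OVERHEAD_S = 0.001          # 1 ms of fixed cost per request
BANDWIDTH = 1_000_000_000   # 1 GB/s transfer rate

for size in (100_000, 1_000_000, 100_000_000):
    share = overhead_fraction(size, OVERHEAD_S, BANDWIDTH)
    print(f"{size:>11} bytes: {share:.1%} of read time is overhead")
```

Under these assumptions the overhead share shrinks as files grow, which is why reducing interprocess hops matters most for workloads made up of many small files.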

Improved Distributed Load

Alluxio 2.6 improves the efficiency and user experience of distributed load ([documentation]({{ '/en/operation/User-CLI.html#distributedload' | relativize_url }})) and adds traceability and metrics for easier operability. Many users rely on the distributed load command to preload data and speed up subsequent training or analytics jobs.
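A preload step can be scripted ahead of a training job. The following is a minimal sketch that only builds the command line; it assumes the `distributedLoad` shell command accepts a `--replication` option and that the launcher lives at `./bin/alluxio`, so check both against the linked documentation for your version. The helper name and the `/data/train` path are hypothetical.

```python
import shlex


def distributed_load_command(path, replication=None, alluxio_bin="./bin/alluxio"):
    """Build the argument list for an `alluxio fs distributedLoad` invocation."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute Alluxio path")
    cmd = [alluxio_bin, "fs", "distributedLoad"]
    if replication is not None:
        if replication < 1:
            raise ValueError("replication must be at least 1")
        # Assumed option name; verify against the User CLI documentation.
        cmd += ["--replication", str(replication)]
    cmd.append(path)
    return cmd


cmd = distributed_load_command("/data/train", replication=2)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a host where the Alluxio client is installed.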

Instrumentation Framework

Alluxio 2.6 adds a large set of metrics and traceability features that let users drill into the system’s operating state ([documentation]({{ '/en/reference/Metrics-List.html' | relativize_url }})). These range from aggregated throughput of the system to summarized metadata latency when serving client requests. The new metrics can be used to measure the current serving capacity of the system and identify potential resource bottlenecks. Request-level tracing and timing information can also be obtained for deep performance analysis.
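Once metrics are exported, they can be post-processed with a few lines of code. The sketch below assumes a JSON document with `gauges` and `counters` sections, the layout used by the Dropwizard metrics servlet; the metric names in `SAMPLE` are placeholders rather than entries from the metrics list, and `flatten_metrics` is a hypothetical helper.

```python
import json


def flatten_metrics(payload):
    """Flatten a Dropwizard-style metrics document into {name: number}."""
    flat = {}
    for name, gauge in payload.get("gauges", {}).items():
        flat[name] = gauge["value"]
    for name, counter in payload.get("counters", {}).items():
        flat[name] = counter["count"]
    return flat


# Placeholder payload; real metric names come from the metrics list.
SAMPLE = json.loads("""
{
  "version": "4.0.0",
  "gauges": {"Example.ActiveRequests": {"value": 3}},
  "counters": {"Example.BytesRead": {"count": 1048576}}
}
""")

for name, value in sorted(flatten_metrics(SAMPLE).items()):
    print(f"{name} = {value}")
```

A flat mapping like this is convenient for comparing two snapshots taken a few seconds apart to estimate throughput.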

Improvements Since 2.5.0

  • Add and improve metrics (8069336{:target="_blank"}) (42729d3{:target="_blank"}) (a7f5349{:target="_blank"}) (b689333{:target="_blank"}) (a2fdf20{:target="_blank"}) (abfd6e1{:target="_blank"}) (e0478e8{:target="_blank"}) (82d155c{:target="_blank"}) (49bc315{:target="_blank"}) (6966476{:target="_blank"}) (35c4189{:target="_blank"}) (942e905{:target="_blank"}) (00867dd{:target="_blank"}) (2202370{:target="_blank"})
  • Improve log and error message (647c671{:target="_blank"}) (c83557a{:target="_blank"}) (030968d{:target="_blank"}) (97faa15{:target="_blank"}) (7b4abe1{:target="_blank"}) (fe561c3{:target="_blank"}) (25e1882{:target="_blank"})
  • Improve test coverage and code stability (431cafa{:target="_blank"}) (308992c{:target="_blank"}) (702e061{:target="_blank"}) (22c2e51{:target="_blank"}) (9e3e9df{:target="_blank"})
  • Improve Rocks metastore (0203366{:target="_blank"}) (ceaaef4{:target="_blank"})
  • Update dependency version (abe1eec{:target="_blank"}) (b96b035{:target="_blank"}) (be4ed06{:target="_blank"}) (37bf0ad{:target="_blank"}) (dcfe0c0{:target="_blank"}) (830d4a8{:target="_blank"}) (a581793{:target="_blank"}) (bebe53e{:target="_blank"}) (ee131ef{:target="_blank"}) (7cc0982{:target="_blank"}) (ef00fcd{:target="_blank"})
  • Improve code style and structure (ab17c50{:target="_blank"}) (91f33de{:target="_blank"}) (4d40d65{:target="_blank"}) (14155df{:target="_blank"}) (27d9b2a{:target="_blank"}) (498ef0f{:target="_blank"}) (681076f{:target="_blank"})
  • Add or improve configurations (a272616{:target="_blank"}) (bccbda8{:target="_blank"}) (aab820c{:target="_blank"}) (ebe719b{:target="_blank"}) (d4890f9{:target="_blank"}) (f0602ec{:target="_blank"}) (33f2697{:target="_blank"})
  • Improve distributed job commands (c939e33{:target="_blank"}) (281430f{:target="_blank"}) (a07403f{:target="_blank"}) (2b429fc{:target="_blank"}) (fc5a280{:target="_blank"}) (1f2c0fd{:target="_blank"}) (e3fd127{:target="_blank"})
  • Improve worker registration (0374c9e{:target="_blank"}) (d4f01ef{:target="_blank"}) (412524c{:target="_blank"}) (23a31c0{:target="_blank"}) (7fb8409{:target="_blank"})
  • Improve Kubernetes integration (8fd77c4{:target="_blank"}) (7964c9f{:target="_blank"}) (fac8df1{:target="_blank"})
  • Enhance worker side RPC handling (4395b54{:target="_blank"})
  • Move container level securityContext to pod level in Kubernetes (076248d{:target="_blank"})
  • Improve the metadata sync (d41fb8d{:target="_blank"})
  • Support stopping the master process on demotion (72fb5d4{:target="_blank"})
  • Improve journal and add monitor for journal space usage (a067b94{:target="_blank"}) (ae96f32{:target="_blank"})
  • Improve block reader (3061a69{:target="_blank"})
  • Improve Worker Interface and implementation (2ad1e1e{:target="_blank"})
  • Implement nondeterministic LRU (feb4aa0{:target="_blank"})
  • Support open file for override in Fuse (55cc3b7{:target="_blank"})
  • Use Generic type to make CreateFileEntry extensible (f8fe02d{:target="_blank"})
  • Add hostAliases in Masters and Worker Pods (9c64bf0{:target="_blank"})
  • Reduce Mem Copy in client (32f64b6{:target="_blank"})
  • Reduce unnecessary creation of breadcrumbs (d946128{:target="_blank"})
  • Improve Web UI (3cfa340{:target="_blank"}) (a734955{:target="_blank"}) (71bdf3d{:target="_blank"})
  • Support CephFS direct and native as underfs of Alluxio (3ef4728{:target="_blank"})
  • Prevent loading child inode metadata when not required (6f18745{:target="_blank"})
  • Support CephFS as underfs of Alluxio (d46003d{:target="_blank"})
  • Change default logserver PVC selectors to empty (f9cc6fb{:target="_blank"})
  • Make connection pools per connection instead of per db for hive udb (d6d7e78{:target="_blank"})
  • Support bypassing tables when attaching and syncing a db (ef55f4a{:target="_blank"})

Bugfixes Since 2.5.0

  • Fix regression in AbsentUfsPathCache (55a3b28{:target="_blank"})
  • Open the source file before creating target file during migrate (c74924d{:target="_blank"})
  • Add a boundary check before Fuse read (3aaf75b{:target="_blank"})
  • Add resource loading fallback logic to ExtensionClassLoader (342f17f{:target="_blank"})
  • Fix potential directBuffer leak (6231603{:target="_blank"})
  • Fix listStatus on an open file stream unexpectedly completing the stream (eeb4193{:target="_blank"})
  • Fix the rejection of legitimate async caching requests when the queue is full (1e7c2f7{:target="_blank"})
  • Cancel sync job in middle of ufs calls if client cancelled (a906c87{:target="_blank"})
  • Fix Fuse write then read problem in async release and support umount (b1b3f40{:target="_blank"})
  • Prevent NPE for AbstractFileSystemShellTest (4c4ec92{:target="_blank"})
  • Fix helm-chart Fuse hostpath type from File to CharDevice (34eb7cb{:target="_blank"})
  • Fix master crash due to journal flush error (70d6d0b{:target="_blank"})
  • Fix the worker hang issue (c668ae8{:target="_blank"})
  • Add null check for bucket column in Glue UDB (3618aa2{:target="_blank"})
  • Fix alluxio-fuse stat for Alluxio-Worker-FUSE (386a9f4{:target="_blank"})
  • Improve Ceph InputStream to be thread safe (265c24f{:target="_blank"})
  • Handle client cache failure on "no space left on device" (89f8971{:target="_blank"})
  • Fix replicate job loading non-persisted files (ce249e3{:target="_blank"})
  • Fix a regression of uncleaned directories (6ae85e6{:target="_blank"})
  • Fix process local block in stream (d605580{:target="_blank"})
  • Fix packaging distributed load code into DistributedLoadUtils (40af661{:target="_blank"})
  • Fix instrumentation setup (837a41c{:target="_blank"})
  • Fix xstream security issue (78f1ec2{:target="_blank"})
  • Fix CCE thrown when using Fuse with client cache enabled (f34b615{:target="_blank"})
  • Fix possible NullPointerException when calling e.getCause (81fe34e{:target="_blank"})
  • Fix NPE in Allocator and add unit tests for Reviewer (f8518d0{:target="_blank"})
  • Fix variable reference and exit status (8de3ea2{:target="_blank"})

Acknowledgements

We want to thank the community for their valuable contributions to the Alluxio 2.6.0 release. In particular, we would like to thank:

Arthur Jenoudet (jenoudet{:target="_blank"}), Baolong Mao (maobaolong{:target="_blank"}), Bing Zheng (bzheng888{:target="_blank"}), Binyang Li (Binyang2014{:target="_blank"}), Ce Zhang (JySongWithZhangCe{:target="_blank"}), Chenliang Lu (yabola{:target="_blank"}), Jinpeng Chi (cutiechi{:target="_blank"}), Daniel Pham (danielcpham{:target="_blank"}), Eugene Ma (Eugene-Mark{:target="_blank"}), Haifeng Wang (yuexingri{:target="_blank"}), Jieliang Li (ljl1988com{:target="_blank"}), ja725, Junfan Zhang (zuston{:target="_blank"}), leewish, Mingchao Zhao (captainzmc{:target="_blank"}), Pan Liu (liupan664021{:target="_blank"}), Peter Roelants (horasal{:target="_blank"}), Pramesh Gupta (pramesh94{:target="_blank"}), Xiang Chen (cdmikechen{:target="_blank"}), Zhenwei Wu (wuzhenwei-xx{:target="_blank"}), Xiang Li (waterlx{:target="_blank"}), Yantao Xue (jhonxue{:target="_blank"}), Yue Shao (ys270), Yun Wang, Zac Blanco (ZacBlanco{:target="_blank"}), and Zhenyu Song (sunnyszy{:target="_blank"})

Enjoy the new release! We look forward to hearing your feedback on our Community Slack Channel.