Releases: hdinsight/release-notes

Azure HDInsight release notes

03 Mar 02:43
74b8eeb

This article provides information about the most recent Azure HDInsight release updates.

Summary

Azure HDInsight is one of the most popular services among enterprise customers for open-source analytics on Azure.
Subscribe to our release notes and watch releases on this GitHub repository.

Release date: February 28, 2023

This release applies to HDInsight 4.0, 5.0, and 5.1. The HDInsight release will be available in all regions over several days. This release is applicable for image number 2302250400. How to check the image number?

HDInsight uses safe deployment practices, which involve gradual region deployment. It may take up to 10 business days for a new release or a new version to be available in all regions.

OS versions

  • HDInsight 4.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4
  • HDInsight 5.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4

For workload-specific versions, see:

HDInsight 4.x component versions
HDInsight 5.x component versions

Note:
Microsoft has issued CVE-2023-23408, which is fixed in the current release. Customers are advised to upgrade their clusters to the latest image.

What's new?

HDInsight 5.1

We have started rolling out a new version of HDInsight 5.1. All new open-source releases are added as incremental releases on HDInsight 5.1.

For more information, see HDInsight 5.x version

Kafka 3.2.0 Upgrade (Preview)

  • Kafka 3.2.0 includes several significant new features/improvements.
    • Upgraded Zookeeper to 3.6.3
    • Kafka Streams support
    • Stronger delivery guarantees for the Kafka producer enabled by default.
    • log4j 1.x replaced with reload4j.
    • Send a hint to the partition leader to recover the partition.
    • JoinGroupRequest and LeaveGroupRequest have a reason attached.
    • Added broker count metrics.
    • MirrorMaker 2 improvements.

HBase 2.4.11 Upgrade (Preview)

  • This version has new features such as the addition of new caching mechanism types for block cache, the ability to alter hbase:meta table and view the hbase:meta table from the HBase WEB UI.

Phoenix 5.1.2 Upgrade (Preview)

  • Phoenix version upgraded to 5.1.2 in this release. This upgrade includes the Phoenix Query Server. The Phoenix Query Server proxies the standard Phoenix JDBC driver and provides a backwards-compatible wire protocol to invoke that JDBC driver.

Ambari CVEs

  • Multiple Ambari CVEs are fixed.

Note:
ESP isn't supported for Kafka and HBase in this release.

End of support

Support for Azure HDInsight clusters on Spark 2.4 ends on February 10, 2024. For more information, see Spark versions supported in Azure HDInsight.

Coming soon

  • Autoscale
    • Autoscale with improved latency and several improvements
  • Cluster name change limitation
    • The maximum length of a cluster name will change from 59 to 45 characters in Public, Mooncake, and Fairfax.
  • Cluster permissions for secure storage
    • Customers can specify (during cluster creation) whether a secure channel should be used for HDInsight cluster nodes to contact the storage account.
  • Non-ESP ABFS clusters [Cluster Permissions for World Readable]
    • We plan to introduce a change in non-ESP ABFS clusters that restricts non-Hadoop group users from executing Hadoop commands for storage operations. This change is intended to improve the cluster security posture. Customers need to plan for the updates.
  • Open-source upgrades
    • Apache Spark 3.3.0 and Hadoop 3.3.4 are under development on HDInsight 5.1 and will include several significant new features, performance improvements, and other improvements.
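
For the cluster name limitation listed above, deployment scripts can check names ahead of the change. The following is a minimal sketch that assumes only the two length limits quoted in this note (59 today, 45 after the change). The function name is hypothetical, and other naming rules are not checked.

```python
# Hypothetical helper: checks only the length limits quoted in this release note.
CURRENT_MAX_LENGTH = 59   # limit before the planned change
PLANNED_MAX_LENGTH = 45   # limit after the planned change


def fits_planned_limit(cluster_name: str) -> bool:
    """Return True if the name is non-empty and within the planned limit."""
    return 0 < len(cluster_name) <= PLANNED_MAX_LENGTH


if __name__ == "__main__":
    for name in ("analytics-prod", "a" * 50):
        status = "ok" if fits_planned_limit(name) else "too long after the change"
        print(f"{len(name)} characters: {status}")
```

Names between 46 and 59 characters pass today's limit but fail this check, so they are the ones to rename before the change takes effect.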

NOTE:
We advise customers to use the latest versions of HDInsight images, as they bring in the best of open-source updates, Azure updates, and security fixes.

Release 2022-12-08

08 Dec 15:52
4393e92

Release date: December 12, 2022

This release applies to HDInsight 4.0 and 5.0. The HDInsight release is made available to all regions over several days.

HDInsight uses safe deployment practices, which involve gradual region deployment. It may take up to 10 business days for a new release or a new version to be available in all regions.

OS versions

  • HDInsight 4.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4
  • HDInsight 5.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4

For workload-specific versions, see here.

What's New?

  • Log Analytics - Customers can enable classic monitoring to get the latest OMS version 14.19. To remove old versions, disable and enable classic monitoring.
  • Ambari - Users are automatically logged out of the Ambari UI after a period of inactivity. For more information, see here.
  • Spark - A new and optimized version of Spark 3.1 is included in this release.

New Regions

  • Qatar Central
  • Germany North

What's changed?

  • HDInsight has moved away from Azul Zulu Java JDK 8 to Adoptium Temurin JDK 8, which supports high-quality TCK certified runtimes, and associated technology for use across the Java ecosystem.

  • HDInsight has migrated to reload4j. The log4j changes are applicable to:

    • Apache Hadoop
    • Apache Zookeeper
    • Apache Oozie
    • Apache Ranger
    • Apache Sqoop
    • Apache Pig
    • Apache Ambari
    • Apache Kafka
    • Apache Spark
    • Apache Zeppelin
    • Apache Livy
    • Apache Rubix
    • Apache Hive
    • Apache Tez
    • Apache HBase
    • Apache OMI
    • Apache Phoenix

Updated

HDInsight will implement TLS 1.2 going forward, and earlier versions will be updated on the platform. If you're running any applications on top of HDInsight and they use TLS 1.0 or 1.1, upgrade them to TLS 1.2 to avoid any disruption in services.

For more information, see How to enable Transport Layer Security (TLS)
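
As an illustration of what upgrading a client application to TLS 1.2 can look like, here is a minimal sketch using Python's standard `ssl` module. It is an example for client code in general, not an HDInsight-specific setting, and the function name is hypothetical.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client-side context that refuses TLS 1.0 and TLS 1.1."""
    context = ssl.create_default_context()
    # Reject any protocol version older than TLS 1.2 during the handshake.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_tls12_context()
    print("Minimum protocol version:", ctx.minimum_version.name)
```

A context built this way can be passed to standard-library clients, for example through the `context` argument of `urllib.request.urlopen`.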

End of Support

Support for Azure HDInsight clusters on Ubuntu 16.04 LTS ends on 30 November 2022. HDInsight began releasing cluster images using Ubuntu 18.04 on June 27, 2021. We recommend that customers running clusters on Ubuntu 16.04 rebuild their clusters with the latest HDInsight images by 30 November 2022.

For more information on how to check the Ubuntu version of a cluster, see here.

  1. Execute the command “lsb_release -a” in the terminal.

  2. If the value for “Description” property in output is “Ubuntu 16.04 LTS”, then this update is applicable to the cluster.
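
The two steps above can be sketched as a small script. This is an illustrative helper rather than an official tool, and the function names are hypothetical. It matches on the prefix "Ubuntu 16.04" because the Description field may carry a point release such as 16.04.7.

```python
import shutil
import subprocess


def parse_description(lsb_output: str) -> str:
    """Extract the value of the "Description" field from `lsb_release -a` output."""
    for line in lsb_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Description":
            return value.strip()
    return ""


def is_ubuntu_1604(lsb_output: str) -> bool:
    """True when the Description field reports an Ubuntu 16.04 release."""
    return parse_description(lsb_output).startswith("Ubuntu 16.04")


if __name__ == "__main__":
    if shutil.which("lsb_release"):
        # Same command as step 1; run it on a cluster node.
        result = subprocess.run(
            ["lsb_release", "-a"], capture_output=True, text=True, check=False
        )
        print("Update applies to this node:", is_ubuntu_1604(result.stdout))
    else:
        print("lsb_release not found; run this on a cluster node.")
```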

Bug fixes

  • Support for Availability Zones selection for Kafka and HBase (write access) clusters.

Open-source bug fixes

Hive bug fixes

Apache JIRA | Bug fix
HIVE-26127 | INSERT OVERWRITE error - File Not Found
HIVE-24957 | Wrong results when subquery has COALESCE in correlation predicate
HIVE-24999 | HiveSubQueryRemoveRule generates invalid plan for IN subquery with multiple correlations
HIVE-24322 | If there's direct insert, the attempt ID has to be checked when reading the manifest files
HIVE-23363 | Upgrade DataNucleus dependency to 5.2
HIVE-26412 | Create interface to fetch available slots and add the default
HIVE-26173 | Upgrade derby to 10.14.2.0
HIVE-25920 | Bump Xerces2 to 2.12.2
HIVE-26300 | Upgrade Jackson data bind version to 2.12.6.1+ to avoid CVE-2020-36518

Release 2021-07-27

28 Jul 21:00
4393e92

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

The OS versions for this release are:

  • HDInsight 3.6: Ubuntu 16.04.7 LTS
  • HDInsight 4.0: Ubuntu 18.04.5 LTS

New features

New Azure Monitor integration experience (Preview)

The new Azure Monitor integration experience will be in Preview in East US and West Europe with this release. Learn more details about the new Azure Monitor experience here.

Deprecation

Basic support for HDInsight 3.6 starting July 1, 2021

Starting July 1, 2021, Microsoft offers Basic support for certain HDInsight 3.6 cluster types. The Basic support plan will be available until 3 April 2022. You are automatically enrolled in Basic support starting July 1, 2021. No action is required by you to opt in. See our documentation for which cluster types are included under Basic support.

We don't recommend building any new solutions on HDInsight 3.6; freeze changes on existing 3.6 environments. We recommend that you migrate your clusters to HDInsight 4.0. Learn more about what's new in HDInsight 4.0.

Behavior changes

HDInsight Interactive Query only supports schedule-based Autoscale

As customer scenarios grow more mature and diverse, we have identified some limitations with Interactive Query (LLAP) load-based Autoscale. These limitations are caused by the nature of LLAP query dynamics, future load prediction accuracy issues, and issues in the LLAP scheduler's task redistribution. Due to these limitations, users may see their queries run slower on LLAP clusters when Autoscale is enabled. The effect on performance can outweigh the cost benefits of Autoscale.

Starting from July 2021, the Interactive Query workload in HDInsight only supports schedule-based Autoscale. You can no longer enable load-based autoscale on new Interactive Query clusters. Existing running clusters can continue to run with the known limitations described above.

Microsoft recommends that you move to a schedule-based Autoscale for LLAP. You can analyze your cluster's current usage pattern through the Grafana Hive dashboard. For more information, see Automatically scale Azure HDInsight clusters.

Upcoming changes

The following changes will happen in upcoming releases.

Built-in LLAP component in ESP Spark cluster will be removed

HDInsight 4.0 ESP Spark clusters have built-in LLAP components running on both head nodes. The LLAP components in ESP Spark clusters were originally added for HDInsight 3.6 ESP Spark, but have no real use case for HDInsight 4.0 ESP Spark. In the next release, scheduled for September 2021, HDInsight will remove the built-in LLAP component from HDInsight 4.0 ESP Spark clusters. This change will help offload head node workload and avoid confusion between the ESP Spark and ESP Interactive Hive cluster types.

New regions

  • West US 3
  • Jio India West
  • Australia Central

Component version change

The following component version has been changed with this release:

  • ORC version from 1.5.1 to 1.5.9

You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.

Back ported JIRAs

Here are the back ported Apache JIRAs for this release:

Impacted Feature | Apache JIRA
Date / Timestamp | HIVE-25104, HIVE-24074, HIVE-22840, HIVE-22589, HIVE-22405, HIVE-21729, HIVE-21291, HIVE-21290
UDF | HIVE-25268, HIVE-25093, HIVE-22099, HIVE-24113, HIVE-22170, HIVE-22331
ORC | HIVE-21991, HIVE-21815, HIVE-21862
Table Schema | HIVE-20437, HIVE-22941, HIVE-21784, HIVE-21714, HIVE-18702, HIVE-21799, HIVE-21296
Workload Management | HIVE-24201
Compaction | HIVE-24882, HIVE-23058, HIVE-23046
Materialized view | HIVE-22566

Release 2021-06-02

03 Jun 19:02
4393e92

Release date: 06/02/2021

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

The OS versions for this release are:

  • HDInsight 3.6: Ubuntu 16.04.7 LTS
  • HDInsight 4.0: Ubuntu 18.04.5 LTS

New features

OS version upgrade

As referenced in Ubuntu’s release cycle, the Ubuntu 16.04 kernel will reach End of Life (EOL) in April 2021. We started rolling out the new HDInsight 4.0 cluster image running on Ubuntu 18.04 with this release. Newly created HDInsight 4.0 clusters will run on Ubuntu 18.04 by default once available. Existing clusters on Ubuntu 16.04 will run as is with full support.

HDInsight 3.6 will continue to run on Ubuntu 16.04. It will change to Basic support (from Standard support) beginning 1 July 2021. For more information about dates and support options, see Azure HDInsight versions. Ubuntu 18.04 will not be supported for HDInsight 3.6. If you’d like to use Ubuntu 18.04, you’ll need to migrate your clusters to HDInsight 4.0.

You need to drop and recreate your clusters if you’d like to move existing HDInsight 4.0 clusters to Ubuntu 18.04. Plan to create or recreate your clusters after Ubuntu 18.04 support becomes available.

After creating the new cluster, you can SSH to your cluster and run sudo lsb_release -a to verify that it runs on Ubuntu 18.04. We recommend that you test your applications in your test subscriptions first before moving to production. Learn more about the HDInsight Ubuntu 18.04 update.

Scaling optimizations on HBase accelerated writes clusters

HDInsight made some improvements and optimizations on scaling for HBase accelerated write enabled clusters. Learn more about HBase accelerated write.

Moving to Azure virtual machine scale sets

HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.

Deprecation

No deprecation in this release.

Behavior changes

Disable Standard_A5 VM size as Head Node for HDInsight 4.0

HDInsight cluster Head Node is responsible for initializing and managing the cluster. Standard_A5 VM size has reliability issues as Head Node for HDInsight 4.0. Starting from this release, customers will not be able to create new clusters with Standard_A5 VM size as Head Node. You can use other two-core VMs like E2_v3 or E2s_v3. Existing clusters will run as is. A four-core VM is highly recommended for Head Node to ensure the high availability and reliability of your production HDInsight clusters.

Network interface resource not visible for clusters running on Azure virtual machine scale sets

HDInsight is gradually migrating to Azure virtual machine scale sets. Network interfaces for virtual machines are no longer visible to customers for clusters that use Azure virtual machine scale sets.

Upcoming changes

The following changes will happen in upcoming releases.

HDInsight Interactive Query only supports schedule-based Autoscale

As customer scenarios grow more mature and diverse, we have identified some limitations with Interactive Query (LLAP) load-based Autoscale. These limitations are caused by the nature of LLAP query dynamics, future load prediction accuracy issues, and issues in the LLAP scheduler's task redistribution. Due to these limitations, users may see their queries run slower on LLAP clusters when Autoscale is enabled. The effect on performance can outweigh the cost benefits of Autoscale.

Starting from July 2021, the Interactive Query workload in HDInsight only supports schedule-based Autoscale. You can no longer enable load-based Autoscale on new Interactive Query clusters. Existing running clusters can continue to run with the known limitations described above.

Microsoft recommends that you move to a schedule-based Autoscale for LLAP. You can analyze your cluster's current usage pattern through the Grafana Hive dashboard. For more information, see Automatically scale Azure HDInsight clusters.

Basic support for HDInsight 3.6 starting July 1, 2021

Starting July 1, 2021, Microsoft will offer Basic support for certain HDInsight 3.6 cluster types. The Basic support plan will be available until 3 April 2022. You'll automatically be enrolled in Basic support starting July 1, 2021. No action is required by you to opt in. See our documentation for which cluster types are included under Basic support.

We don't recommend building any new solutions on HDInsight 3.6; freeze changes on existing 3.6 environments. We recommend that you migrate your clusters to HDInsight 4.0. Learn more about what's new in HDInsight 4.0.

VM host naming will be changed on July 1, 2021

HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. This migration will change the FQDN format of cluster host names, and the numbers in the host names are not guaranteed to be in sequence. If you want to get the FQDN names for each node, refer to Find the Host names of Cluster Nodes.

Bug fixes

HDInsight continues to make cluster reliability and performance improvements.

Component version change

You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.

Release 2021-03-24

26 Mar 00:11
4393e92

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

New features

Spark 3.0 preview

HDInsight added Spark 3.0.0 support to HDInsight 4.0 as a Preview feature.

Kafka 2.4 preview

HDInsight added Kafka 2.4.1 support to HDInsight 4.0 as a Preview feature.

Moving to Azure virtual machine scale sets

HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.

Eav4-series support

HDInsight added Eav4-series support in this release. Learn more about Eav4-series here. The series is available in the following regions:

  • AUSTRALIA EAST
  • BRAZIL SOUTH
  • CENTRAL US
  • EAST ASIA
  • EAST US
  • JAPAN EAST
  • SOUTHEAST ASIA
  • UK SOUTH
  • WEST EUROPE
  • WEST US 2

Deprecation

No deprecation in this release.

Behavior changes

Default cluster version is changed to 4.0

The default version of HDInsight cluster is changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.

Default cluster VM sizes are changed to Ev3-series

Default cluster VM sizes are changed from D-series to Ev3-series. This change applies to head nodes and worker nodes. To avoid this change impacting your tested workflows, specify the VM sizes that you want to use in the ARM template.
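
One way to pin VM sizes is to set them explicitly for each role in the template's compute profile. The sketch below builds such a fragment as a Python dictionary and prints it as JSON. The `computeProfile` / `roles` / `hardwareProfile.vmSize` layout is assumed from the HDInsight cluster resource in ARM templates and should be verified against the current template reference. The function name, VM sizes, and instance counts are placeholders.

```python
import json


def compute_profile(head_vm_size: str, worker_vm_size: str, worker_count: int) -> dict:
    """Build a computeProfile fragment with an explicit VM size for each role."""
    return {
        "computeProfile": {
            "roles": [
                {
                    "name": "headnode",
                    "targetInstanceCount": 2,
                    "hardwareProfile": {"vmSize": head_vm_size},
                },
                {
                    "name": "workernode",
                    "targetInstanceCount": worker_count,
                    "hardwareProfile": {"vmSize": worker_vm_size},
                },
            ]
        }
    }


if __name__ == "__main__":
    # Placeholder sizes: substitute the sizes your workloads were tested on.
    fragment = compute_profile("Standard_D12_v2", "Standard_D13_v2", worker_count=4)
    print(json.dumps(fragment, indent=2))
```

Because the sizes are stated in the template, a later change to the service defaults does not alter what gets deployed.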

Network interface resource not visible for clusters running on Azure virtual machine scale sets

HDInsight is gradually migrating to Azure virtual machine scale sets. Network interfaces for virtual machines are no longer visible to customers for clusters that use Azure virtual machine scale sets.

Upcoming changes

The following changes will happen in upcoming releases.

OS version upgrade

HDInsight will be upgrading OS version from Ubuntu 16.04 to 18.04. The upgrade will complete before April 2021.

HDInsight 3.6 moves to Basic support in July 2021

Starting July 2021, Microsoft will offer Basic support for certain HDInsight 3.6 cluster types. The Basic support plan will be available until 3 April 2022. You'll automatically be enrolled in Basic support starting July 2021. No action is required by you to opt in. See our documentation for which cluster types are included under Basic support.

We don't recommend building any new solutions on HDInsight 3.6; freeze changes on existing 3.6 environments.

We recommend that you migrate your clusters to HDInsight 4.0. Learn more about what's new in HDInsight 4.0.

Bug fixes

HDInsight continues to make cluster reliability and performance improvements.

Component version change

Added support for Spark 3.0.0 and Kafka 2.4.1 as Preview.
You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.

Release 2021-02-05

09 Feb 07:12
4393e92

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

New features

Dav4-series support

HDInsight added Dav4-series support in this release. Learn more about Dav4-series here.

Kafka REST Proxy GA

Kafka REST Proxy enables you to interact with your Kafka cluster via a REST API over HTTPS. Kafka REST Proxy is generally available starting from this release. Learn more about Kafka REST Proxy here.

Moving to Azure virtual machine scale sets

HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.

Deprecation

Disabled VM sizes

Starting from January 9, 2021, HDInsight will block all customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing clusters will run as is. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
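
Deployment scripts can be checked for the disabled sizes before they fail at cluster creation. This is a minimal sketch that assumes the list of sizes given above; the helper name is hypothetical, and the comparison ignores letter case.

```python
# VM sizes named in this note as blocked for new cluster creation.
BLOCKED_VM_SIZES = {"standard_a8", "standard_a9", "standard_a10", "standard_a11"}


def find_blocked_sizes(requested_sizes: list[str]) -> list[str]:
    """Return the requested sizes that are on the blocked list, in input order."""
    return [size for size in requested_sizes if size.lower() in BLOCKED_VM_SIZES]


if __name__ == "__main__":
    requested = ["Standard_D12_v2", "Standard_A8", "Standard_E4_v3"]
    print("Blocked sizes in request:", find_blocked_sizes(requested))
```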

Behavior changes

Default cluster VM size changes to Ev3-series

Default cluster VM sizes will be changed from D-series to Ev3-series. This change applies to head nodes and worker nodes. To avoid this change impacting your tested workflows, specify the VM sizes that you want to use in the ARM template.

Network interface resource not visible for clusters running on Azure virtual machine scale sets

HDInsight is gradually migrating to Azure virtual machine scale sets. Network interfaces for virtual machines are no longer visible to customers for clusters that use Azure virtual machine scale sets.

Breaking change for .NET for Apache Spark 1.0.0

HDInsight will introduce the first major official release of .NET for Apache Spark in the next release. It provides DataFrame API completeness for Spark 2.4.x and Spark 3.0.x along with other features. This major version includes breaking changes; refer to this migration guide to understand the steps needed to update your code and pipelines. Learn more here.

Upcoming changes

The following changes will happen in upcoming releases.

Default cluster version will be changed to 4.0

Starting February 2021, the default version of HDInsight cluster will be changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.

OS version upgrade

HDInsight is upgrading OS version from Ubuntu 16.04 to 18.04. The upgrade will complete before April 2021.

HDInsight 3.6 end of support on June 30, 2021

HDInsight 3.6 will reach end of support. Starting from June 30, 2021, customers can't create new HDInsight 3.6 clusters. Existing clusters will run as is without support from Microsoft. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.

Bug fixes

HDInsight continues to make cluster reliability and performance improvements.

Component version change

No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.

Release 2020-11-18

24 Nov 04:41
4393e92

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

New features

Auto key rotation for customer managed key encryption at rest

Starting from this release, customers can use Azure Key Vault version-less encryption key URLs for customer-managed key encryption at rest. HDInsight will automatically rotate the keys as they expire or are replaced with new versions. Learn more details here.
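
To illustrate what "version-less" means here, the sketch below drops the trailing version segment from a key URL. It assumes the usual Key Vault key URL shape, `https://<vault>/keys/<key-name>/<version>`. The function name, vault name, key name, and version string are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_versionless_key_url(key_url: str) -> str:
    """Drop the trailing version segment from a Key Vault key URL, if present."""
    parts = urlsplit(key_url)
    segments = [segment for segment in parts.path.split("/") if segment]
    # Expected path shapes: keys/<name> or keys/<name>/<version>
    if len(segments) not in (2, 3) or segments[0] != "keys":
        raise ValueError(f"not a Key Vault key URL: {key_url}")
    path = "/" + "/".join(segments[:2])
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # Hypothetical vault, key name, and version.
    versioned = "https://myvault.vault.azure.net/keys/mykey/0123456789abcdef"
    print(to_versionless_key_url(versioned))
```

A URL that is already version-less passes through unchanged, so the function is safe to apply to either form.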

Ability to select different Zookeeper virtual machine sizes for Spark, Hadoop, and ML Services

HDInsight previously didn't support customizing Zookeeper node size for Spark, Hadoop, and ML Services cluster types. It defaults to A2_v2/A2 virtual machine sizes, which are provided free of charge. From this release, you can select a Zookeeper virtual machine size that is most appropriate for your scenario. Zookeeper nodes with virtual machine size other than A2_v2/A2 will be charged. A2_v2 and A2 virtual machines are still provided free of charge.

Moving to Azure virtual machine scale sets

HDInsight now uses Azure virtual machines to provision the cluster. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.

Deprecation

Deprecation of HDInsight 3.6 ML Services cluster

HDInsight 3.6 ML Services cluster type will reach end of support by December 31, 2020. Customers won't be able to create new 3.6 ML Services clusters after December 31, 2020. Existing clusters will run as is without support from Microsoft. Check the support expiration for HDInsight versions and cluster types here.

Disabled VM sizes

Starting from November 16, 2020, HDInsight will block new customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing customers who have used these VM sizes in the past three months won't be affected. Starting from January 9, 2021, HDInsight will block all customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing clusters will run as is. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.

Behavior changes

Add NSG rule checking before scaling operation

HDInsight added network security group (NSG) and user-defined route (UDR) checks to scaling operations. The same validation is now done for cluster scaling as for cluster creation. This validation helps prevent unpredictable errors. If validation doesn't pass, scaling fails. To learn more about how to configure NSGs and UDRs correctly, refer to HDInsight management IP addresses.

Upcoming changes

The following changes will happen in upcoming releases.

Default cluster version will be changed to 4.0

Starting February 2021, the default version of HDInsight cluster will be changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.

HDInsight 3.6 end of support on June 30, 2021

HDInsight 3.6 will reach end of support. Starting from June 30, 2021, customers can't create new HDInsight 3.6 clusters. Existing clusters will run as is without support from Microsoft. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.

Bug fixes

HDInsight continues to make cluster reliability and performance improvements.

Component version change

No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.

Release 2020-11-09

16 Nov 17:16
4393e92

Release date: 11/09/2020

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

New features

HDInsight Identity Broker (HIB) is now GA

HDInsight Identity Broker (HIB) that enables OAuth authentication for ESP clusters is now generally available with this release. HIB Clusters created after this release will have the latest HIB features:

  • High Availability (HA)
  • Support for Multi-Factor Authentication (MFA)
  • Federated users sign in with no password hash synchronization to AAD-DS

For more information, see HIB documentation.

Moving to Azure virtual machine scale sets

HDInsight now uses Azure virtual machines to provision the cluster. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.

Deprecation

Deprecation of HDInsight 3.6 ML Services cluster

HDInsight 3.6 ML Services cluster type will reach end of support by December 31, 2020. Customers won't be able to create new 3.6 ML Services clusters after December 31, 2020. Existing clusters will run as is without support from Microsoft. Check the support expiration for HDInsight versions and cluster types here.

Disabled VM sizes

Starting from November 16, 2020, HDInsight will block new customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing customers who have used these VM sizes in the past three months won't be affected. Starting from January 9, 2021, HDInsight will block all customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing clusters will run as is. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.

Upcoming changes

The following changes will happen in upcoming releases.

Ability to select different Zookeeper virtual machine sizes for Spark, Hadoop, and ML Services

HDInsight today doesn't support customizing Zookeeper node size for Spark, Hadoop, and ML Services cluster types. It defaults to A2_v2/A2 virtual machine sizes, which are provided free of charge. In the upcoming release, you can select a Zookeeper virtual machine size that is most appropriate for your scenario. Zookeeper nodes with virtual machine size other than A2_v2/A2 will be charged. A2_v2 and A2 virtual machines are still provided free of charge.

Default cluster version will be changed to 4.0

Starting February 2021, the default version of HDInsight cluster will be changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.

HDInsight 3.6 end of support on June 30, 2021

HDInsight 3.6 will reach end of support. Starting from June 30, 2021, customers can't create new HDInsight 3.6 clusters. Existing clusters will run as is without support from Microsoft. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.

Bug fixes

Fix issue for restarting VMs in cluster

The issue with restarting VMs in the cluster has been fixed. You can again use PowerShell or the REST API to reboot nodes in the cluster.

Release 2020-10-08

14 Oct 05:36
4393e92

Release date: 10/08/2020

This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

New features

HDInsight private clusters with no public IP and Private link (Preview)

HDInsight now supports creating clusters with no public IP and private link access to the clusters in preview. Customers can use the new advanced networking settings to create a fully isolated cluster with no public IP and use their own private endpoints to access the cluster.

Moving to Azure virtual machine scale sets

HDInsight currently uses Azure virtual machines to provision clusters. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without any customer action. No breaking change is expected.

Deprecation

Deprecation of HDInsight 3.6 ML Services cluster

The HDInsight 3.6 ML Services cluster type will reach end of support by Dec 31 2020. Customers can't create new 3.6 ML Services clusters after that date. Existing clusters will run as is without support from Microsoft. Check the support expiration for HDInsight versions and cluster types here.

Upcoming changes

The following changes will happen in upcoming releases.

Ability to select different Zookeeper virtual machine sizes for Spark, Hadoop, and ML Services

HDInsight today doesn't support customizing Zookeeper node size for Spark, Hadoop, and ML Services cluster types. It defaults to A2_v2/A2 virtual machine sizes, which are provided free of charge. In the upcoming release, you can select a Zookeeper virtual machine size that is most appropriate for your scenario. Zookeeper nodes with virtual machine size other than A2_v2/A2 will be charged. A2_v2 and A2 virtual machines are still provided free of charge.

Component version change

No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.

Release 2020-09-28

02 Oct 20:25

Release date: 09/28/2020

This release applies to both HDInsight 3.6 and HDInsight 4.0. HDInsight releases are made available to all regions over several days. The release date here indicates the first region's release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.

New features

Autoscale for Interactive Query with HDInsight 4.0 is now generally available

Autoscale for the Interactive Query cluster type is now Generally Available (GA) for HDInsight 4.0. All Interactive Query 4.0 clusters created after 27 August 2020 will have GA support for autoscale.

HBase cluster supports Premium ADLS Gen2

HDInsight now supports Premium ADLS Gen2 as the primary storage account for HDInsight HBase 3.6 and 4.0 clusters. Together with Accelerated Writes, you can get better performance for your HBase clusters.

Kafka partition distribution on Azure fault domains

A fault domain is a logical grouping of underlying hardware in an Azure data center. Each fault domain shares a common power source and network switch. Previously, HDInsight Kafka might store all replicas of a partition in the same fault domain. Starting from this release, HDInsight supports automatic distribution of Kafka partitions based on Azure fault domains.
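To make the idea concrete, here is a minimal sketch of fault-domain-aware replica placement. It is not HDInsight's or Kafka's actual implementation; the function names and the broker-to-fault-domain map are hypothetical. It orders brokers round-robin across fault domains and then assigns each partition's replicas to consecutive brokers in that order.

```python
from collections import defaultdict


def interleave_by_fault_domain(broker_fd):
    """Order brokers so that neighbours come from different fault domains
    whenever the domains hold similar numbers of brokers."""
    by_fd = defaultdict(list)
    for broker, fd in sorted(broker_fd.items()):
        by_fd[fd].append(broker)
    queues = [by_fd[fd] for fd in sorted(by_fd)]
    depth = max(len(q) for q in queues)
    ordered = []
    # Take the i-th broker of every fault domain in turn.
    for i in range(depth):
        for queue in queues:
            if i < len(queue):
                ordered.append(queue[i])
    return ordered


def assign_replicas(broker_fd, num_partitions, replication_factor):
    """Return {partition: [broker, ...]} using the interleaved broker order."""
    if replication_factor > len(broker_fd):
        raise ValueError("replication factor exceeds the number of brokers")
    ordered = interleave_by_fault_domain(broker_fd)
    n = len(ordered)
    return {
        p: [ordered[(p + r) % n] for r in range(replication_factor)]
        for p in range(num_partitions)
    }


if __name__ == "__main__":
    # Hypothetical cluster: six brokers spread evenly over three fault domains.
    brokers = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
    for partition, replicas in assign_replicas(brokers, 6, 3).items():
        print(partition, replicas, [brokers[b] for b in replicas])
```

With evenly sized fault domains, as in the example, every partition's three replicas land in three different domains. If the domains are unbalanced, this simple scheme can still place two replicas in the same domain, so treat it as an illustration of the goal rather than a guarantee.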

Encryption in transit

Customers can enable encryption in transit between cluster nodes using IPSec encryption with platform-managed keys. This option can be enabled at cluster creation time. See more details about how to enable encryption in transit.

Encryption at host

When you enable encryption at host, data stored on the VM host is encrypted at rest and flows encrypted to the storage service. Starting from this release, you can enable encryption at host on the temp data disk when creating the cluster. Encryption at host is only supported on certain VM SKUs in limited regions. HDInsight supports the following node configuration and SKUs. See more details about how to enable encryption at host.

Moving to Azure virtual machine scale sets

HDInsight currently uses Azure virtual machines to provision clusters. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without any customer action. No breaking change is expected.

Upcoming changes

The following changes will happen in upcoming releases.

Ability to select different Zookeeper SKU for Spark, Hadoop, and ML Services

HDInsight today doesn't support changing the Zookeeper SKU for Spark, Hadoop, and ML Services cluster types. It uses the A2_v2/A2 SKU for Zookeeper nodes, and customers aren't charged for them. In the upcoming release, customers can change the Zookeeper SKU for Spark, Hadoop, and ML Services as needed. Zookeeper nodes with a SKU other than A2_v2/A2 will be charged. The default SKU will still be A2_v2/A2 and free of charge.

Component version change

No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.