# Introduction to H2O

H2O is a fast, scalable, open-source machine learning and deep learning platform for smarter applications. Using in-memory compression, H2O can handle billions of data rows in memory, even on a fairly small cluster. The platform includes interfaces for R, Python, Scala, Java, JSON, and CoffeeScript/JavaScript, along with a built-in web interface, Flow, that makes it easier for non-engineers to stitch together complete analytic workflows. The platform was built alongside (and on top of) both Hadoop and Spark clusters and can typically be deployed within minutes.

H2O implements almost all common machine learning algorithms, such as generalized linear modeling (linear regression, logistic regression, etc.), Naïve Bayes, principal components analysis, time series, k-means clustering, and others. H2O also implements best-in-class algorithms such as Random Forest, Gradient Boosting, and Deep Learning at scale. Customers can build thousands of models and compare them to get the best prediction results.

## Requirements
At a minimum, we recommend the following for compatibility with H2O:

#### Operating Systems:
* Windows 7 or later
* OS X 10.9 or later
* Ubuntu 12.04
* RHEL/CentOS 6 or later

#### Languages:
Scala, R, and Python are not required to use H2O unless you want to use H2O in those environments, but **Java is always required**. 

Supported Java versions are Java 8, 9, 10, 11, 12, and 13.

* To build H2O or run H2O tests, the 64-bit JDK is required.
* To run the H2O binary using either the command line, R, or Python packages, only the 64-bit JRE is required.

Both are available on the [Java download page](https://www.oracle.com/java/technologies/javase-downloads.html).

Supported versions of the optional languages are:

* Scala 2.10 or later
* R version 3 or later
* Python 2.7.x, 3.5.x, 3.6.x
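The version requirements above can be checked before installing anything. The following is a minimal sketch, not part of H2O itself: the helper names (`parse_java_major`, `installed_java_major`, `is_supported_python`) are hypothetical, and it assumes `java -version` prints a line containing `version "…"`, which is how Oracle and OpenJDK builds commonly report it.

```python
import re
import subprocess
import sys

# Version ranges taken from the requirements listed above.
SUPPORTED_JAVA = range(8, 14)               # Java 8 through 13
SUPPORTED_PYTHON = {(2, 7), (3, 5), (3, 6)}


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, e.g. "1.8.0_251".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        # `java -version` writes to stderr, so merge it into the captured output.
        raw = subprocess.check_output(["java", "-version"], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_java_major(raw.decode("utf-8", "replace"))


def is_supported_python(version_info):
    """Check a (major, minor, ...) tuple against the Python versions listed above."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHON


if __name__ == "__main__":
    java_major = installed_java_major()
    print("Java major version:", java_major)
    print("Java supported:", java_major in SUPPORTED_JAVA)
    print("Python supported:", is_supported_python(sys.version_info))
```

If Java is missing or not on `PATH`, `installed_java_major()` returns `None` and the script reports Java as unsupported instead of raising an error.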

#### Browser:
An internet browser is required to use H2O’s web UI, Flow. Supported versions include the latest version of Chrome, Firefox, Safari, or Internet Explorer.

#### HDFS:
Hadoop: Hadoop is not required to run H2O unless you want to deploy H2O on a Hadoop cluster. Supported versions are listed on the Download page (when you select the Install on Hadoop tab) and include:

* Cloudera CDH 5.4 or later
* Hortonworks HDP 2.2 or later
* MapR 4.0 or later
* IBM Open Platform 4.2

Spark: Version 2.1, 2.2, or 2.3. Spark is only required if you want to run Sparkling Water.

## Installation in Python 

#### Install Java

Download a supported version of Java (Java 14 is not supported) from the [Java download page](https://www.oracle.com/java/technologies/javase-downloads.html). On Windows, after installing Java, add it to the `PATH` system variable; see this [reference](https://javatutorial.net/set-java-home-windows-10) for how to set it.
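To confirm that the `PATH` setting took effect, one option is to ask Python where it finds the `java` executable. This is a small sketch under stated assumptions: `locate_java` is a hypothetical helper name, and `shutil.which` requires Python 3.3 or later, so it will not run on Python 2.7.

```python
import os
import shutil


def locate_java(path=None):
    """Return the full path of the `java` executable found on `path`, or None.

    With path=None, shutil.which searches the current PATH variable.
    """
    return shutil.which("java", path=path)


if __name__ == "__main__":
    print("java on PATH:", locate_java())
    # JAVA_HOME is optional for this check; it is printed only for reference.
    print("JAVA_HOME:", os.environ.get("JAVA_HOME"))
```

A result of `None` for the first line suggests that the Java `bin` directory has not been added to `PATH`, or that the terminal was opened before the variable was changed.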

#### Install H2O

Run the following commands in a Terminal window to install `H2O` for Python.

1. Install dependencies
```
pip install requests
pip install tabulate
pip install "colorama>=0.3.8"
pip install future
```

2. Install H2O Python module
```
pip install -f http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Py.html h2o
```

3. Initialize H2O in Python and Test
```
import h2o
h2o.init()
h2o.demo("glm")
```
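If `import h2o` fails, it can help to check which of the packages from the steps above are actually importable. The sketch below is illustrative only: `missing_modules` is a hypothetical helper, and it assumes each package's import name matches its `pip` name, which holds for `requests`, `tabulate`, `colorama`, and `future`.

```python
import importlib

# Import names of the packages installed in the steps above.
REQUIRED_MODULES = ["requests", "tabulate", "colorama", "future", "h2o"]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All modules are importable; h2o.init() can be called next.")
```

The check only imports the modules and does not start H2O, so it is safe to run repeatedly while fixing the installation.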