Home
Download Java JDK version 1.8.
Download Hadoop version 2.8.0 from https://hadoop.apache.org/release/2.8.0.html (the tar.gz file), then configure the Hadoop files listed below.
These files are located in hadoop-2.8.0/etc/hadoop/:
- core-site.xml
- hdfs-site.xml
- yarn-site.xml
- mapred-site.xml
- hadoop-env.cmd
Configure core-site.xml like this:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Before editing hdfs-site.xml, create a directory named data inside hadoop-2.8.0:
- Inside that folder, create two more folders: namenode and datanode
Then configure hdfs-site.xml like this:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/hadoop-2.8.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hadoop-2.8.0/data/datanode</value>
</property>
</configuration>
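The folder layout that the two dfs.*.dir values point to can also be created with a short script. This is a minimal sketch, not part of the original steps; the function name `make_hdfs_dirs` is hypothetical, and the install path you pass in is whatever location you extracted hadoop-2.8.0 to.

```python
from pathlib import Path


def make_hdfs_dirs(hadoop_home):
    """Create data/namenode and data/datanode under hadoop_home; return both paths."""
    data = Path(hadoop_home) / "data"
    created = []
    for name in ("namenode", "datanode"):
        target = data / name
        # parents=True also creates the intermediate "data" folder;
        # exist_ok=True makes the call safe to repeat.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For example, `make_hdfs_dirs("C:/hadoop-2.8.0")` would create both folders if Hadoop was extracted to that (assumed) location.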
Configure yarn-site.xml like this:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
Configure mapred-site.xml like this:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
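A typo in one of these site files is easy to miss, so it can help to read a file back and look at the name/value pairs it really contains. The sketch below uses only the Python standard library; the helper name `read_site_properties` is hypothetical, and the sample text is the mapred-site.xml content shown above.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop site file."""
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError("root element must be <configuration>")
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Strip stray whitespace around names and values.
            props[name.strip()] = (value or "").strip()
    return props


sample = """
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
""".strip()

print(read_site_properties(sample))
```

To check a real file, read its text first (for example with `open(path, encoding="utf-8").read()`) and pass it to the same function.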
- In hadoop-env.cmd, set the Java path:
set JAVA_HOME=C:\Java\jdk1.8.0_202
- Download the zip folder from http://backend.onstep.in/hadoopconfig/, extract it, then copy its bin folder over the bin folder in your hadoop-2.8.0/bin.
- Open the Command Prompt in administrator mode.
- At this point Hadoop should be available on your PATH.
- Next, format your namenode from the Command Prompt:
- hdfs namenode -format
- Finally, start Hadoop:
- start-all.cmd
- After this command, four Command Prompt windows will open: namenode, datanode, nodemanager, and YARN resourcemanager.
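One way to confirm that the daemons really came up is to probe their TCP ports. The sketch below is an assumption-laden check, not part of the original steps: port 9000 comes from the fs.defaultFS value above, while the other ports are the stock Hadoop 2.x web UI defaults and may differ on your install. The names `HADOOP_PORTS`, `is_port_open` and `check_services` are hypothetical.

```python
import socket

# Assumed ports: 9000 is taken from fs.defaultFS; the rest are the
# default Hadoop 2.x web UI ports and may have been changed locally.
HADOOP_PORTS = {
    "namenode (RPC)": 9000,
    "namenode (web UI)": 50070,
    "datanode (web UI)": 50075,
    "resourcemanager (web UI)": 8088,
    "nodemanager (web UI)": 8042,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(ports, host="localhost"):
    """Map each service label to True/False depending on whether its port answers."""
    return {label: is_port_open(host, port) for label, port in ports.items()}


if __name__ == "__main__":
    for label, up in check_services(HADOOP_PORTS).items():
        print(f"{label}: {'up' if up else 'not reachable'}")
```

A port that answers only shows that something is listening there; the four Command Prompt windows remain the place to look for startup errors.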
1. Download Hive 2.1.1 from https://archive.apache.org/dist/hive/hive-2.1.1/ (the tar.gz file).
2. Download Derby 10.12.1.1 from https://archive.apache.org/dist/db/derby/db-derby-10.12.1.1/ (the tar.gz file).
Extract the downloaded files.
Go to the Derby folder, copy the lib folder, and paste it into the Hive folder. Then, in the Hive folder, open the conf folder and create a new file named
hive-site.xml
Then paste this data into that file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.ClientDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>hive.server2.enable.impersonation</name>
<description>Enable user impersonation for HiveServer2</description>
<value>false</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<description>Enable user impersonation for HiveServer2</description>
<value>false</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>NOSASL</value>
<description> Client authentication types. NONE: no authentication check LDAP: LDAP/AD based authentication KERBEROS: Kerberos/GSSAPI authentication CUSTOM: Custom authentication provider (Use with property hive.server2.custom.authentication.class) </description>
</property>
<property>
<name>datanucleus.autoCreateTables</name>
<value>True</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>
</configuration>
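If you would rather not paste the XML by hand, a file of the same shape can be generated from a small dictionary. This is a sketch using only the Python standard library; `build_site_xml` is a hypothetical helper, it leaves out the description elements and the stylesheet line, and the dictionary holds only some of the values listed above.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a {name: value} mapping as a <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0"?>\n' + body + "\n"


# Values copied from the hive-site.xml listing above.
hive_settings = {
    "javax.jdo.option.ConnectionURL": "jdbc:derby://localhost:1527/metastore_db;create=true",
    "javax.jdo.option.ConnectionDriverName": "org.apache.derby.jdbc.ClientDriver",
    "hive.server2.authentication": "NOSASL",
    "hive.server2.thrift.port": 10000,
    "hive.server2.webui.port": 10002,
}
xml_text = build_site_xml(hive_settings)
print(xml_text)
```

Writing `xml_text` to conf/hive-site.xml inside the Hive folder would then replace the manual paste step.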
1. Add the Derby path and the Hive path to your environment variables.
First, check that Hadoop is running; if it is not, start it. Second, start the Derby server with this command:
startNetworkServer -h 0.0.0.0
When installing some packages with pip you may run into Visual Studio build errors. To avoid them, install a prebuilt wheel of the bitarray package.
Go to https://www.lfd.uci.edu/~gohlke/pythonlibs/#bitarray, find the bitarray section, and download the file that matches your Python version.
- Then open the Command Prompt in the folder where the file was downloaded and install it with pip.
For example:
pip install bitarray-1.7.1-cp39-cp39-win_amd64.whl
pip install impyla
pip install thrift_sasl
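To work out which bitarray wheel matches your interpreter, you can print the tag that appears in the file name (the `cp39-cp39-win_amd64` part of the example above). The sketch below assumes CPython on Windows only; `windows_wheel_tag` is a hypothetical helper, not a pip API.

```python
import sys


def windows_wheel_tag(major, minor, is_64bit):
    """Return the CPython wheel tag for Windows, e.g. 'cp39-cp39-win_amd64'."""
    python_tag = f"cp{major}{minor}"
    # CPython builds before 3.8 carry an "m" ABI suffix (cp37m); later ones do not.
    abi_tag = python_tag + ("m" if (major, minor) < (3, 8) else "")
    platform_tag = "win_amd64" if is_64bit else "win32"
    return f"{python_tag}-{abi_tag}-{platform_tag}"


current = windows_wheel_tag(
    sys.version_info.major,
    sys.version_info.minor,
    sys.maxsize > 2**32,  # True on a 64-bit interpreter
)
print(current)
```

Pick the download whose file name ends with the printed tag followed by `.whl`.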
First, check that Hadoop, Derby, and Hive are running; start any that are not.
Next, start the Hive server with this command:
hive --service hiveserver2
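from impala.dbapi import connect

# Connect to HiveServer2 on the Thrift port set in hive-site.xml (NOSASL auth)
conn = connect(host="localhost", port=10000, auth_mechanism="NOSASL")
c = conn.cursor()
c.execute("show databases")
print(c.fetchall())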