
Pre-test Commands

First, boot up the Docker environment by executing this command in this folder:

docker compose up -d

Before testing the environment, we need to create the tables in Apache Kudu via Impala:

docker exec -it impala impala-shell

In the Impala shell, use the following commands to create the jogos, proc, and equipes tables:

connect;   -- (run again if not successful)

CREATE TABLE jogos (
  partida STRING,
  mapa STRING,
  equipe1 STRING,
  equipe2 STRING,
  vitorioso STRING, 
  ct STRING,
  tr STRING,
  PRIMARY KEY(partida, mapa)
)
STORED AS KUDU;

CREATE TABLE proc (
  partida STRING,
  mapa STRING,
  equipe1 STRING,
  equipe2 STRING,
  vitorioso STRING,
  md STRING,
  PRIMARY KEY(partida, mapa)
)
STORED AS KUDU;

CREATE TABLE equipes (
  equipe STRING,
  jogos DECIMAL(8, 5),
  vitorias DECIMAL(8, 5),
  derrotas DECIMAL(8, 5),
  md1 DECIMAL(8, 5),
  md2 DECIMAL(8, 5),
  md3 DECIMAL(8, 5),
  md5 DECIMAL(8, 5),
  jmd1 DECIMAL(8, 5),
  jmd2 DECIMAL(8, 5),
  jmd3 DECIMAL(8, 5),
  jmd5 DECIMAL(8, 5),
  PRIMARY KEY(equipe)
)
STORED AS KUDU;

exit;

After creating the tables, you can stop Impala:

docker stop impala

Now, check the logs of the pyspark-notebook container to find the link with the access token for the PySpark environment:

docker logs pyspark-notebook

Finally, to execute the tests, enter the PySpark environment, upload the data_insertion notebook, and run all cells.
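
The notebook itself is not reproduced here, but as a rough illustration of how a PySpark cell can talk to the Kudu tables created above, the sketch below inserts and reads back one row of the jogos table. It assumes the kudu-spark connector jar is available to the notebook's Spark session and that the Kudu master is reachable at kudu-master:7051; both the service name and the Kudu-side table name (impala::default.jogos, the default name Impala gives a Kudu-backed table) may differ in this compose setup.

# A minimal sketch, not the actual data_insertion notebook. Assumes the
# kudu-spark connector is on the Spark classpath and the Kudu master is
# reachable at kudu-master:7051 (adjust to match docker-compose.yml).
from pyspark.sql import Row, SparkSession

KUDU_MASTER = "kudu-master:7051"        # assumed service name and port
JOGOS_TABLE = "impala::default.jogos"   # default Kudu name of an Impala-created table

spark = SparkSession.builder.appName("data_insertion").getOrCreate()

# Insert one example match into the jogos table.
match = spark.createDataFrame([Row(
    partida="exemplo-1", mapa="de_dust2",
    equipe1="Equipe A", equipe2="Equipe B",
    vitorioso="Equipe A", ct="Equipe A", tr="Equipe B")])

(match.write
    .format("org.apache.kudu.spark.kudu")
    .option("kudu.master", KUDU_MASTER)
    .option("kudu.table", JOGOS_TABLE)
    .mode("append")
    .save())

# Read the table back to confirm the row arrived.
(spark.read
    .format("org.apache.kudu.spark.kudu")
    .option("kudu.master", KUDU_MASTER)
    .option("kudu.table", JOGOS_TABLE)
    .load()
    .show())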

About

Course project for the Massive Data Processing (Processamento Massivo de Dados) course. A polyglot database using Neo4j and Apache Kudu to build an application that presents statistics and ratings of CS:GO matches.
