BigData-ETL GitHub Repository
Here you can find the posts that we have published on our blog.
Apache Spark Convert DataFrame to DataSet in Scala - read 1 min!
In this post I will show you how easy it is to convert a DataFrame to a Dataset in Apache Spark using Scala. You will often want strong typing on your data in Spark, and the best way to get it is to use a Dataset instead of a DataFrame. I give a simple example of how to get a Dataset from data that comes from a CSV file.
...==================
In this post I will try to introduce you to the main differences between the Apache Spark reduceByKey and groupByKey methods and why you should avoid the latter. But why? The answer is the shuffle.
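To make the shuffle argument concrete, here is a minimal plain-Python simulation (it is not Spark code). The partition contents and the helper names are made up for illustration. It counts how many records would have to cross the shuffle when values are combined inside each partition first, as reduceByKey does, versus when every record is sent unchanged, as groupByKey does.

```python
from collections import defaultdict

# Hypothetical input: two partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("a", 1), ("b", 1)],
]

def shuffled_records_group_by_key(parts):
    """groupByKey-style: every record crosses the shuffle unchanged."""
    return sum(len(p) for p in parts)

def shuffled_records_reduce_by_key(parts):
    """reduceByKey-style: values are summed per key inside each partition
    first, so each partition sends at most one record per key."""
    total = 0
    for p in parts:
        combined = defaultdict(int)
        for key, value in p:
            combined[key] += value
        total += len(combined)
    return total

def sum_by_key(parts):
    """Final per-key sums; identical whichever strategy moves the data."""
    result = defaultdict(int)
    for p in parts:
        for key, value in p:
            result[key] += value
    return dict(result)

print(shuffled_records_group_by_key(partitions))   # records moved without a map-side combine
print(shuffled_records_reduce_by_key(partitions))  # records moved with a map-side combine
print(sum_by_key(partitions))
```

The result is the same in both cases; only the amount of data moved between partitions differs, and that gap grows with the number of repeated keys per partition.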
...==================
You will receive the error [ Default interface methods are only supported starting with Android N ] due to the missing compileOptions configuration in the build.gradle file.
...==================
[ socket failed EPERM Operation not permitted ] The first place to look for the cause is the AndroidManifest.xml file. Make sure you add the permission:
...==================
Problem -> Android
You can encounter this problem when you use the org.modelmapper.ModelMapper class. In most cases the issue is that you don't provide the constructors and/or getters and setters for your class.
...==================
[SOLVED] How to connect Android Emulator to localhost application- 1 simple solution?
To connect from the Android Emulator to an application running on your own machine, you must use the appropriate IP address, because inside the emulator localhost / 127.0.0.1 refers to the emulator's own loopback interface, not to your machine.
...==================
DialogFragment Android Pass Arguments Or Parameters? - Easy & Quick 1 Min Tutorial!
In this short post I will show you how to dynamically pass arguments to a DialogFragment from another Fragment or Activity.
...==================
Apache Airflow Short Introduction And Simple DAG - 2 Cool Secrets to Becoming an Airflow Beginner!
Apache Airflow is software that you can easily use to schedule and monitor your workflows. It is written in Python. Like any software, Airflow consists of concepts that describe its main, atomic functionalities. In Airflow you will encounter:
...==================
How to install Hortonworks Sandbox with Data Platform in Microsoft Azure? - useful guide - Part 2
Hello! In the previous tutorial we created a Hortonworks Sandbox virtual machine in Azure (Install Hortonworks Sandbox with Data Platform). In this tutorial I will show you how to connect to this VM and how to use the Hortonworks stack.
...==================
Install DBeaver Ubuntu - 1 simple and easy step!
DBeaver Community is a free tool that lets you work with many different databases from one place. It is written in Java and based on Eclipse.
...==================
How to check if the table has a primary key Oracle Database - 5 Types of cool constraints!
In this tutorial I will show you how to check whether a table has a primary key in Oracle Database. A constraint is a rule that you define; its task is to protect the table from clutter that may arise from adding incorrect or incomplete data.
...==================
How To Install Oracle SQL Developer On Ubuntu 18.04, 20.04 Or 22.04? - Easy Tutorial In 3 Mins!
In this short post I will show you how to install Oracle SQL Developer on Ubuntu 18.04, Ubuntu 20.04 or even Ubuntu 22.04! Don't waste your time, just install it!
...==================
SQL Developer How Change Password? - 2 Cool And Simple Methods?
In this short tutorial I will show you how to change or reset a password in SQL Developer. We can change the password in at least two ways, using:
...==================
In this post I will show you how you can easily run Microsoft SQL Server in Docker using docker-compose. You won't believe how simple it is! You will have a ready-to-work environment in just a few minutes! Try it yourself!
...==================
In this tutorial I will show you how to install Hortonworks Sandbox with Data Platform in Microsoft Azure. Log in to the Azure portal and click the Create a resource button. In the search field, type Hortonworks (that is enough to find the image we are looking for) and click the Create button.
...==================
SQL Server Pivot Function - Converts Rows To Columns - 1 Cool Example!
Sometimes you get a business requirement to create a dashboard and you must use the SQL Server PIVOT function to convert rows to columns. You can transpose them in two ways: using CASE or the PIVOT function. In this post I will show you how to convert rows to columns in SQL Server in both ways.
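The CASE-based variant is plain conditional aggregation, so it can be sketched without a SQL Server instance. The example below uses SQLite through Python only to stay self-contained; the `sales` table, its columns and its rows are hypothetical. The PIVOT operator itself is T-SQL specific and is not shown here.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Q1", 100),
        ("North", "Q2", 150),
        ("South", "Q1", 80),
        ("South", "Q2", 120),
    ],
)

# One output column per quarter: each CASE keeps only the rows that belong
# to that quarter, and SUM collapses them into a single value per region.
pivot_sql = """
SELECT region,
       SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
       SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
FROM sales
GROUP BY region
ORDER BY region
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

The trade-off of the CASE approach is that every target column has to be written out by hand, which is fine for a small, fixed set of values such as quarters.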
...==================
[ cgroup mountpoint does not exist ] Open a terminal and paste the following command to fix the issue. Then you can run the Docker client as usual.
...==================
After installing Docker you may encounter the error: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/images/json: dial unix /var/run/docker.sock: connect: permission denied
...==================
[SOLVED] Why Oracle database is slow in docker - 1 not obvious reason?
In this short tutorial I will answer the question: why is an Oracle database slow in Docker? Recently, when I needed a testing environment in which one of the elements was an Oracle DB, I encountered a problem that caused me a lot of frustration (Docker Oracle Database Unhealthy).
...==================
[SOLVED] Docker & Windows 10: Error Starting Userland Proxy - Check Simple Solution In 3 Mins!
To resolve the issue from the title [ Error starting userland proxy ] we must do the things described in this article. I usually use Docker on Linux, but this time I had to fight with it under Windows 10. After a few minutes I found out once again that everything is easier under Linux, because I kept running into problems with the correct and stable operation of Docker for Windows. But to the point: what was the problem?
...==================
When you start a job in Talend Big Data, you may see the warning:
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries
==================
[SOLVED] Talend Updating Parameters Configuration Finalized - Check Cool Example In 3 Minutes!
It may happen that you need to change the values of configuration parameters after they have already been finalised in the Talend Administration Center (TAC).
...==================
Talend Data Integration Free Online Course - 12 Lessons!
Learn Talend and earn a lot of money! Are you wondering how to start your adventure in IT? Are you interested in data warehouses and data integration, but do not know where to start? Talend Data Integration is one of the simplest and most user-friendly ETL tools. Interestingly, the Open Studio version is available to everyone and is a great option for learning Talend tools.
...==================
I would like to present to you a game that was created by my brother. We strongly encourage you to play it! :) Let us know what you think about it!
Keep your brain working at incredible speed by playing this combination of puzzle and skill game. If you've played such titles as Five in a Row or Lines, be sure to try PagoBalls, which combines the best of them while adding completely new dynamics to the game.
...==================
Sometimes the Informatica Repository Navigator disappears: if you undock the Repository Navigator from one of the PowerCenter tools (e.g. Designer, Workflow Manager) and close it, it can disappear completely. You then cannot dock it again, because it is no longer available.
...==================
[SOLVED] Talend tSAPTableInput DATA_BUFFER_EXCEEDED error rfc - check working solution in 2 mins!
When running a Job in Talend that uses the tSAPTableInput component, you may encounter the following error: Talend tSAPTableInput DATA_BUFFER_EXCEEDED, which means that the data buffer was exceeded while retrieving the SAP table.
...==================
Jenkins: How To Trigger Another Pipeline From Current Job? - 1 Cool Secret To Do It!
In this tutorial I will show you how to trigger another pipeline from the current job. You have probably been in a situation where you wanted to run another Jenkins job from the current pipeline. It's very easy to achieve! We will go step by step through the stages of the work phase, its recovery, and checking the logs.
...==================
This error (OS=Windows and the assembly descriptor contains) may occur during a Maven build. The reason is simple: a wrong path in the <outputDirectory> section of the assembly configuration XML file.
...==================
This is a typical error [ Maven could not find artifact ] when, for the first time, you use dependencies that do not come from the most popular Maven repository. In this case I wanted to use a package that comes from the Confluent Maven repository.
...==================
How To Create Dropdown List In Excel? - Easy & Clear 3 steps!
Manual input of data in MS Excel forms and text cells can lead to many errors and to dirty records in your data. If the user makes, for example, a typo in the city name or adds a space at the end of the text, it becomes a completely new record in the database. By preparing forms, you can avoid such situations by allowing users to enter only permitted values in selected fields. Drop-down lists will help you with this!
...==================
MySQL How To Check Column Data Types in MySQL Database? - The Ultimate Guide Through Easy 5 ways!
There are at least a few ways to get column data types in a MySQL database. I will show you the 5 most popular ways to check column data types in MySQL.
...==================
Akka.io Quickstart With Java - Check My 1 Cool Secret!
Very often you will meet the requirement to create an application that collects data from an external system through its API. It may turn out that the bottleneck of such an application is the time of its implementation.
...==================
In this post I will show you how you can use the Spotify Dockerfile Maven plugin to build an image of a Spring Boot app and then launch a container based on it.
...==================
Spring Boot Spring Initializer and awesome small simple Web application - check in 5 mins!
In this post I will show you how you can start your adventure with Spring Boot. We will create a new project using Spring Initializr and implement a simple controller that displays "Hello world from web Spring Boot!" in your browser.
...==================
What Is A NullPointerException in Java? Let's Understand And Learn How To Avoid It!
In this post I will present one of the most common errors in the programming world. What is a NullPointerException in Java? I will try to explain the topic, present many examples, and use diagrams that (I hope) make it easy to see what is happening under the hood in Java code and why you are getting a NullPointerException!
...==================
In this short post I will show you how to create a new branch and clone the repository in SVN, and how to make a copy of the main repository in your local branch.
...==================
TortoiseSVN Command Line Interface (CLI) SVN - 1 Min Easy Tutorial!
In this post, I will show you how to enable the CLI for TortoiseSVN and run commands on the Windows operating system. With the default settings you can only use the Graphical User Interface (GUI), so you may have been surprised that the system did not recognise the svn command.
...==================
Talend Studio How to change language - 5 easy steps?
In this short post I will show you how to change the language in Talend Studio, using TOS for Data Integration as the example.
...==================
[SOLVED] Teradata Error 3807 SQLState 42s02 Object 'XYZ' does not exists - easy solution!
In Teradata we can encounter the error [Error 3807] [SQLState 42s02] Object 'XYZ' does not exists or Failed [3807: 42s02] Object 'XYZ' does not exists, where 'XYZ' is the name of the object we specified in the query. The cause of this error can be diagnosed very quickly. In this short post I will explain the reason for the error and how to solve it.
...==================
Teradata multiset vs set table - where is the difference - cool example - check in 5 mins?
In this short post you will find out what the difference is between SET and MULTISET tables in Teradata and why you need to know it before creating your table.
...==================
In this post, you will learn what the Teradata Primary Index (PI) is, why it is worth defining, and what types of PI exist. You will also read about a very important feature of Teradata: if you do not specify the primary index, it does not mean that it is not there!
...==================
Teradata Error 3653 SQLState 21S02: In this post, I will explain why you encountered the error message [Error 3653] [SQLState 21S02] All select-lists do not contain the same number of expressions, also known as Failed [3653: 21S02] All select-lists do not contain the same number of expressions. I will present the cause of the problem and show how to avoid the error in the future.
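The message itself says that the select lists have different numbers of expressions, which typically happens when queries are combined with a set operator such as UNION. The sketch below reproduces the same class of mistake with SQLite as a stand-in, only to stay self-contained; Teradata is not involved, SQLite's error text differs from Teradata's, and the helper name `try_query` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def try_query(sql):
    """Run a query and return its rows, or the database error message."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return "error: " + str(exc)

# Mismatched: two expressions in the first select list, one in the second.
print(try_query("SELECT 1, 2 UNION ALL SELECT 3"))

# Fixed: both select lists contain two expressions.
print(try_query("SELECT 1, 2 UNION ALL SELECT 3, 4"))
```

The fix is the same in any database: make every SELECT in the set operation return the same number of expressions, padding with NULL or a constant where a value is missing.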
...==================
Very often, when you first import data from flat files using the BTEQ import function, you get the error message Failure 2673 The source parcel length does not match data that was defined. In this post I want to show you what the cause is and how to fix it quickly.
...==================
In this post I will show you how to install Teradata Express version 16.20 on VMware Workstation 15.5.0 Player on the Windows operating system.
...==================
Teradata Convert Rows To Columns - Pivot Rows To Columns - Cool And Easy 3 Example Of Usage!
Sometimes you get a business requirement to create a new view or dashboard, and to do this you need to convert rows to columns. You can transpose them in two ways: using CASE or the PIVOT function. In this post I will show you how to convert rows to columns in Teradata in both ways.
...==================
Teradata Failure 7547 Target row updated by multiple source rows
You will get this error code [ Failure 7547 Target row updated by multiple source rows ] when you try to update the target table using multiple records from the source. It does not matter at all whether you use MERGE or a direct UPDATE statement. In this post I will show you where the problem is and how to fix it.
...==================
In this tutorial I will show you the difference between the CASESPECIFIC and NOT CASESPECIFIC data type attributes. When you create tables, you can set many data type attributes such as CHARACTER SET, FORMAT, UPPERCASE or (NOT) CASESPECIFIC. The CASESPECIFIC attribute specifies case for character data comparisons and collations. What does that mean? In this post I will show, with examples, what it means for your columns to be case specific or not.
...==================
Data migration between two different database systems always involves the conversion of data types. In this post, I will introduce the concept of data migration and show how to convert data types from Microsoft SQL Server to Teradata database types.
...==================
[SOLVED] How to run several commands on a Linux system in parallel mode? - check 1 simple solution
You would like to run multiple commands at the same time, but each of them should be run in a separate thread. The following script will allow you to do this.
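The post's own script is not reproduced here. As a rough illustration of the idea, the following Python sketch starts all commands at once as separate processes and then waits for every one of them; the function name `run_parallel` and the sample commands are hypothetical.

```python
import subprocess
import sys

def run_parallel(commands):
    """Start every command immediately, then wait for all of them to finish.

    Returns the exit codes in the same order as the commands were given.
    """
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in processes]

if __name__ == "__main__":
    # Hypothetical commands; replace them with your own.
    commands = [
        [sys.executable, "-c", "import time; time.sleep(1); print('first done')"],
        [sys.executable, "-c", "import time; time.sleep(1); print('second done')"],
    ]
    print(run_parallel(commands))
```

Because every process is started before any of them is waited on, the total run time is close to that of the slowest command rather than the sum of all of them.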
...==================
[SOLVED] Maven settings.xml password special characters - 1 simple solution
The problem is simply that settings.xml uses the XML format, which restricts the characters that are allowed.
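One common way around this restriction is to escape the reserved characters as XML entities before putting the value into the file. Whether the post uses entities or another approach is not stated here, so treat the following as a sketch under that assumption; the password value and the helper name are made up.

```python
from xml.sax.saxutils import escape

def escape_for_settings_xml(password):
    """Replace &, < and > with their XML entities so the value is valid element text."""
    return escape(password)

raw_password = "p&ss<w>rd"  # hypothetical password containing reserved characters
print("<password>" + escape_for_settings_xml(raw_password) + "</password>")
```

The ampersand is the character that most often causes trouble, because an unescaped `&` starts an entity reference and makes the whole file unparseable.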
...==================
Bash Parse Input Arguments And Functions With Parameters - Check Simple 2 Examples!
Bash is basically a modern, sh-compatible shell. As we can read on the official page, Bash is described as follows: It offers functional improvements over sh for both programming and interactive use. In addition, most sh scripts can be run by Bash without modification.
...==================
How To Add A Hostname And IP To The Hosts File On Windows? - Check How Easy It Is In 3 Minutes!
To add a hostname and IP to the hosts file on Windows you need to take a few simple steps. Go through this post and you will learn how to edit the hosts file!
...==================
How to add an environment variable on Windows? - check how easy it is in 2 minutes!
To add an environment variable, open the Start Menu and, depending on whether Windows is set to Polish (PL) or English (ENG), enter the appropriate phrase:
...==================
How To Compare Two Files In Notepad++ v7.8.1 - Check My Secret!
In this post I will show you how to compare two files in Notepad++. In the latest versions of the tool (such as v7.8.1), installing the Compare plugin is extremely simple and quick.
...==================
In this tutorial I will show you how to install Microsoft Teams on Ubuntu 18.04. Recently, Microsoft Teams has been gaining more and more popularity among users. All in all, it is not surprising, because currently on the market it is difficult to find a better tool for communication between people in an organisation or in private use.
...==================
How to install Microsoft Teams on Windows 7, Windows 10 or Windows 11? - easy and short guide!
Let's install Microsoft Teams on Windows! Recently, Microsoft Teams has been gaining more and more popularity among users. All in all, it is not surprising, because currently on the market it is difficult to find a better tool for communication between people in an organisation or in private use.
...==================
How to add JavaFX library to IntelliJ IDEA and Java 11-14? - you won't believe how easy it is!
In this tutorial I will show you how to add the JavaFX library to IntelliJ IDEA. Starting with Java 11, the JavaFX libraries were removed from the JDK, so to use them you need to download the missing libraries and attach them to the project manually.
...==================
How to use Lombok @Builder in abstract java class - @SuperBuilder? - Check How It Is Easy In 2 Min!
The answer is - you cannot, but you can use @SuperBuilder annotation instead of @Builder.
...==================
Ubuntu Linux: The Brother DCP-J562DW Scanner Does Not Work - 2 Min Easy Solution!
If you've ever had a problem with a Brother scanner that does not work, then this post is just for you. I was terribly frustrated, because every time I had to scan something the same problem arose: the scanning program, whether Simple Scan or Gscan2pdf, did not detect my scanner. But after a few hours of searching the Internet for the possible reason, I managed to find a solution, which I present to you below.
...==================
VSCode Edit All Rows (Multi Positions) In/Or Visual Studio Code - Notepad++ Super Easy 2 Short Tips?
In this post I will show you how you can easily edit column values in multiple rows at once in VSCode. Thanks to this trick you will save a lot of time that you would otherwise lose tediously editing each row separately.
...==================
VSCode format curly brackets on the same line c# - check simple solution in 2 mins!
[ VSCode format curly brackets on the same line c# ] If you work with other programming languages such as Java or Scala, you have probably been annoyed more than once by how C# code is formatted by default. Every method or function, if-else block, or other multi-line construct gets its opening brace on a separate line. See the example below:
...==================
Wordpress & Enlighter Error MooTools Framework not loaded yet! - read easy solution in 2 mins!
In this post I will show you the solution for the Enlighter error "MooTools Framework not loaded yet"! You have probably installed a caching JS and CSS optimizer for your Wordpress website. I encountered this issue after installing the Autoptimize plugin.
...==================
Wordpress Domain Name Change WP-Admin Stopped Working - Quick Fix In 3 Mins!
I know that this topic has nothing to do with BigData, but maybe someone is looking for a solution to this problem [ Wordpress domain name change wp-admin stopped ], so to the point! :)
...==================
[SOLVED] Docker: How To Delete Images Tagged none - 1 Cool Tip?
When we often rebuild an application image that is marked with the same tag, e.g. latest, we encounter the problem that our image registry ends up with many images marked as <none>.
...==================
[SOLVED] Docker The Input Device Is Not a TTY error - Check My 1 Secret To Solve It!
In this post I will show you the super easy solution for error: Docker The input device is not a TTY! This error will occur if you are trying to run container using -it option, which means that you want to run container for interactive processes (like a shell).
...==================
[SOLVED] MessageBodyWriter not found for media type=application/json - Check Simple 1 Min Solution!
MessageBodyWriter not found for media type=application/json: this error occurs because the jersey-media-json-jackson dependency is missing.
...==================
[SOLVED] SQL Developer How To Restore Connections Tab - Save Your Time! 2 Easy & Short Solutions?
If you have been using SQL Developer for some time, you have probably been in the situation where you closed the Connections tab, accidentally or for some other reason. How do you restore it or add it again and get back the default view in Oracle SQL Developer? In this short post I will show you how to add it in two ways.
...==================
How To Check Column Data Types In SQL Server Database - 2 Secrets To Be SQL Master!
There are at least a few ways to get column data types in a Microsoft SQL Server database. I will show you the two most popular ways to check column data types in SQL Server and explain the difference between the MAX_LENGTH (CHARACTER_OCTET_LENGTH) and CHARACTER_MAXIMUM_LENGTH columns.
...==================
SQL Server Extract Year Month Or Day From DATE Data Type? - 2 simple and useful queries!
If you have used databases like Oracle or Teradata, you may be surprised that a very popular function like extract(year from date) does not work in Microsoft's databases. In this post I show you how to extract only the day, month or year from a date in Microsoft SQL Server or Azure SQL Database. Both of these databases use the Transact-SQL language.
...==================
Maven Could not resolve dependencies for project... Could not transfer artifact... sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
...==================
IntelliJ IDEA is a Java-based Integrated Development Environment (IDE) that aims to boost developer productivity. It takes care of the mundane and repetitive duties for you by providing intelligent code completion, static code analysis, and refactorings, allowing you to concentrate on the more pleasurable aspects of software development.
...==================
In this post, I will explain why you encountered an error Teradata Error 3504 SQLState HY000 Selected non-aggregate values must be part of the associated group while executing the SQL query in Teradata. I will present the cause of the problem and show how to avoid the error in the future.
...==================
[SOLVED] Teradata Error 3854 SQLState HY000 - is not a view - cool tip in 3 mins!
In this post I will explain why you encountered the error [Error 3854] <your object name> is not a view while executing a DDL query in Teradata, what caused the problem and how to resolve it.
...==================
Often, instead of the standard username and password, you would like to use a private SSH key to connect to a remote host via SSH. In this short post, I will show you how to connect to your server instance using an SSH key and PuTTY on Windows.
...==================
Linux: How to create copy move or delete files and directories? - quick & easy 3 mins guide!
In this post I will show you how to create, copy, move or delete files and folders on the Ubuntu operating system.
...==================
MS Excel VBA How to add Developer tab in Excel? - 2 quick & short steps!
In this tutorial we will focus on how to add the Developer tab in Excel. Programming in MS Excel VBA can significantly speed up and facilitate your work. Visual Basic for Applications is a programming language implemented in Microsoft Office programs that you can use to write macros. So what is a macro? Macros are programs that can automate your work.
...==================
How To Load Ehcache.xml From External Location Spring Boot? - Quick & Easy Solution In 2 Mins!
In this tutorial we will focus on how to load ehCache.xml from an external location in Spring Boot. Out of the box, ehCache looks for the ehCache.xml configuration file in the resources path, which is packed into the jar file. If you want to use an external ehCache.xml configuration file, you can use the VM options and pass the appropriate value there.
...==================
Teradata Studio: How to change query font size in SQL Editor? - set beauty font in 5 mins!
In this post I will show you how to change the query font size in the SQL Editor! If you need to, you can easily change the font size in Teradata Studio for the Query, Answerset or History window. Increasing or decreasing the font size can quickly improve your work, especially if you are sharing your screen with colleagues. It takes only a few seconds!
...==================
Do you use Hyper-V virtualisation to create a virtual machine, but it does not have internet access? No problem. We can solve it easily! (Docker for Windows Hyper-V share the Internet)
...==================
Microsoft Excel How to put new line at the same cell in file? - quick & cool 1 approach!
When you press the Enter key in a Microsoft Excel file, the cursor moves to the next cell. If you want to add a line break between lines or paragraphs of your text, you have to use a keyboard shortcut. In this post I will show you how to put a new line in the same Excel cell.
...==================
Jenkins Nexus SonarQube Docker-Compose: Build The DevOps environment - Cool 4 components!
This is another post of mine in which I show, in a simple way, how to set up a DevOps environment (Jenkins, Nexus, SonarQube) using the benefits of Docker-Compose.
...==================
GitHub How To Configure Connection Over SSH - Ubuntu 18.04 - A Comprehensive Tutorial!
In this post, we'll focus on configuring communication between our computer and the GitHub server using the SSH (Secure Shell) protocol, a standard communication protocol used in TCP/IP computer networks.
...==================
How to run MySQL database using Docker-Compose in 3 minutes? - cool example!
In this article we will focus only on showing how to quickly run MySQL database using Docker-Compose. In short: docker-compose MySQL. Using volumes, we do not lose changes that we make in the database. Changes will still be visible after closing and restarting the container.
...==================
The issue [ injectionmanagerfactory not found ] exists because, starting from version 2.26, Jersey is not backward compatible. For more information, see the official Jersey site.
...==================
Apache Kafka How to delete data from Kafka topic? - you probably didn't know these 2 cool methods!
When working with Apache Kafka, there may be a situation when we need to delete data from a topic, for example because junk data was sent during testing and we have not yet implemented handling for such errors. The result is a so-called poison pill: one or more records that make our processing fail every time we try to consume them from Kafka.
...==================
Talend Kafka MongoDB Docker-Compose Real-Time Streaming - Integrate Them Together in 3 Mins?
In today's world, we often meet requirements for real-time data processing. There are quite a few tools on the market that allow us to achieve this. At the forefront are Apache Kafka and Apache Flink. Spark Structured Streaming and Spark Streaming are often put in the same bag, but this is a mistake, because Spark uses an approach that we call micro-batch, that is, processing data in small packages.
...==================
[SOLVED] Apache Spark Check If The File Exists On HDFS? - 1 Min Solution!
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware.
The Hadoop Distributed File System provides high throughput access to application data and is suitable for applications that have large data sets. The Hadoop Distributed File System relaxes a few POSIX requirements to enable streaming access to file system data. The Hadoop Distributed File System was originally built as infrastructure for the Apache Nutch web search engine project. The Hadoop Distributed File System is now an Apache Hadoop subproject. The project URL is https://hadoop.apache.org/hdfs/.
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
[ Apache Spark Check if the file exists on HDFS ] We will use the FileSystem and Path classes from the org.apache.hadoop.fs package to achieve it.
...==================
[SOLVED] Configuration of Apache Spark Scala and IntelliJ IDEA - short and easy 5 steps!
Let's start with the configuration of Apache Spark, Scala and IntelliJ! I assume that you have already installed IntelliJ IDEA (otherwise, please go to the official IntelliJ website and download the Community Edition).
...==================
Apache Spark Save DataFrame As a Single File HDFS - 1 Min Solution?
In this tutorial I will show an example of how to save a DataFrame as a single file on HDFS with Apache Spark. When you save a DataFrame as a file on HDFS, you may find that it is written as many files.
...==================
Apache Spark Use DataFrame Efficiently During Reading Data? - Check My 3 Secret Tips!
In this short Apache Spark tutorial I will show you how to use the DataFrame API to increase the performance of a Spark application while loading large, semi-structured data sets such as CSV, XML and JSON.
...==================
Let's consider two methods of reading data from the same Hive table [ CreateOrReplaceTempView Performance ]. The execution plan will be the same for both, because both use the Catalyst optimiser and the Tungsten engine, which have been available since Spark 2.0. In the future I will prepare posts about these two buzzwords in the Spark world (Catalyst and Tungsten).
...==================
How to Apache Spark Break DAG lineage - Do you know these 3 cool methods?
In this post, I will introduce you to 3 ways to break the DAG lineage in Apache Spark. It's quite possible that you weren't even aware of one of them! Check if you know all 3 methods which, depending on the conditions and requirements, can save you a lot of time!
...==================
I have been meaning to write this post about football match prediction for a long time. One day, when I was playing with the capabilities of the Apache Spark MLlib library, I came up with an idea …
...==================
Apache Spark Machine Learning Predicting Diabetes In Patients - Make Your 1st Cool ML!
Today I will show you how you can use the Machine Learning (ML) libraries that are available in Spark under the name Spark MLlib.
...==================
Apache Spark Rename Or Delete a File HDFS - Great Example In 1 Minute?
In this short post I will show you how to rename or delete a file on HDFS using Apache Spark.
...==================
How to install Apache Spark Standalone in CentOs 7? - check how it is easy in 5 mins!
In this tutorial I will show you how you can easily install Apache Spark Standalone on CentOS 7. First of all, you have to install Java on your machine.
...==================
How to run shell command in Scala from the code level - check great code snippet in 1 minute?
Scala is a high-level programming language that mixes object-oriented and functional programming. Scala's static types help complicated applications avoid problems, and its JVM and JavaScript runtimes allow you to construct high-performance systems with simple access to a vast library ecosystem.
...==================
In this short tutorial I will give you a hint on how to convert data in Apache Hive from one format to another (ORC to Parquet) without any additional application.
...==================
In this post I will show you a few ways to save Hive data as ORC, Parquet, text or CSV, including how to export data from Hive to a CSV file. For this tutorial I have prepared a table, test_csv_data, with a few records in it.
...==================
How to copy files from one directory to another on HDFS (Hadoop)? - Start doing it the 1 right way!
We often need to copy data between directories on HDFS. Generally, we can perform such an operation in several ways, depending on how complex the copying case is: you may want to copy only the files that do not yet exist at the destination (without overwriting anything), or you may want all the files to be copied again.
...==================
Run Cloudera QuickStart using Docker - easy steps & setup in 5 mins!
In this short post I will show how you can run the Cloudera QuickStart using Docker. As you know from my previous post, I am a big fan of Docker and of everything related to it. It's a great tool and I use it in many situations, because it makes it very easy to run a specific application or set up a complex environment in a few minutes. Additionally, it allows us to manage the application easily.
...==================
Connection Between Talend And Cloudera. Check My Secret! 3 Simple & Easy Steps!
In this post I will show you how to set up a connection between Talend and Cloudera so that you can connect to CDP. Once you have gone through this tutorial, you will be able to use Talend to connect to Cloudera.
...==================
[SOLVED] How to connect the Android Emulator to a localhost application - 1 super simple solution?
To connect the Android Emulator to a localhost application running on your current machine, you must provide the appropriate IP address, because Android internally treats localhost/127.0.0.1 as the address of its own loopback interface.
...==================
You will get the error [ Default interface methods are only supported starting ...] because of the missing compileOptions configuration in the AndroidManifest.xml file.
...==================
Apache Spark RDD ReduceByKey vs RDD GroupByKey - differences and comparison - a simple comparison in 5 min!
In this post I will try to present the main differences between the Apache Spark RDD methods ReduceByKey and GroupByKey, and why you should avoid the latter. Why? The answer lies in the concept of the shuffle.
...==================
Cause 1 -> Android Emulator SocketException socket failed EPERM
The first place to look for the cause of [ SocketException socket failed EPERM ] should be the AndroidManifest.xml file. Make sure the permissions have been added:
...==================
In this post I will show you how easy it is to run a Microsoft SQL Server database with Docker and docker-compose.
...==================
Hello! In this short tutorial I will show you how to convert data in Apache Hive from one format to another (ORC to Parquet) without any additional application.
...==================
Talend Kafka MongoDB Docker-Compose - Data Stream
In this post our goal is to build a real-time stream processing pipeline consisting of Talend, Kafka and MongoDB, on an architecture created with docker-compose. We will implement the processing logic in Talend Open Studio for Big Data (TOSBD).
...==================
Apache Spark Use DataFrame Efficiently When Reading Data? - 3 simple tips
In this short tutorial I will show how to use the DataFrame API to improve the performance of a Spark application when loading large, semi-structured data sets such as CSV, XML and JSON.
...==================
Apache Spark How to check if a file exists on HDFS? - Check in 1 minute!
We will use the FileSystem and Path classes from the org.apache.hadoop.fs package. With them we can check in a few lines of code whether a file exists on HDFS (Hadoop Distributed File System).
...==================
Apache Spark Scala IntelliJ: Hi! I assume that you already have IntelliJ IDEA installed. If not, please go to the official website and download the Community Edition.
...==================
How to install Apache Spark Standalone on CentOS? - check 4 simple steps!
In this post I will show you how to install Apache Spark Standalone on CentOS. First of all, you have to install Java on your machine.
...==================
Apache Spark How to rename or delete a file on HDFS - 1 minute?
In this short post I will cover how to rename or delete a file on HDFS with Apache Spark.
...==================
Let's consider two methods of reading data from the same Hive table (Apache Spark SQL versus the DataFrame/DataSet API). In both cases the execution plan will be the same, because both use the Catalyst optimizer and the Tungsten engine, which has been available since Spark 2.0 (Tungsten). In the future I will prepare posts on Catalyst and Tungsten to explain how they work.
...==================
Apache Spark Save DataFrame as a single file on HDFS? - 1 min read
In this post we will look at saving a DataFrame as a single file on HDFS with Apache Spark. If you save a DataFrame as a file on HDFS, you may find that it is written as many files. This is perfectly correct behavior and results from the way Apache Spark parallelizes work. However, if we want to force the output into a single file, we have to repartition the DataFrame into one partition. To do that, call the coalesce method before the write and pass it the number of partitions.
...==================
Scala How to run a shell command from code - see how simple it is in 1 minute?
To run a shell command from Scala, import the scala.sys.process library and then use its Domain Specific Language (DSL), which is invoked with the exclamation mark (!). Below is a code snippet that may help you see how the library is used in practice:
...==================
[SOLVED] Oracle database runs slowly in Docker - 1 non-obvious reason?
The problem of an Oracle database running slowly in Docker is not as obvious as it might seem. Recently, when I needed a test environment in which one of the components was an Oracle database, I ran into a problem that caused me a lot of frustration.
...==================
Docker & Windows 10: Error starting userland proxy - a simple reason! [SOLVED]
I usually enjoy the benefits of Docker on Linux, but this time I was forced to wrestle with it on Windows 10. After a few minutes I was once again convinced that everything is easier on Linux, because I kept running into problems with getting Docker for Windows to work correctly and stably. But to the point: what was the problem?
...==================
Akka.io Quick Start Java - a first encounter with the library!
Akka.io Quick Start Java: Very often you may face the requirement to build an application whose job is to collect data from an external system through its API. It may turn out that the bottleneck of such an application is its execution time.
...==================
Spring Boot and Spotify Dockerfile Maven: an application in Docker using the plugin - 3 minutes is enough!
In this post I will show you how to use the Spotify Dockerfile Maven plugin with Spring Boot to build an image and then run a container based on it.
...==================
In this post the topic is initializing a Spring Boot project. We will create a new project using Spring Initializr and implement a simple controller that displays the text Hello world from web Spring Boot! in your browser.
...==================
Talend Data Integration online course - 12 great lessons to get started!
Are you wondering how to start your adventure in the IT world? Are you interested in data warehouses and data integration, but don't know where to begin? One of the simplest and most visually friendly tools for data processing and ETL processes in the broad sense is Talend Data Integration. Interestingly, the Open Studio version is available to everyone and is a great option for learning the Talend tools.
...==================
Installing the TortoiseSVN Command Line Interface (CLI) for SVN - installation in 2 minutes!
In this post I will show how to install TortoiseSVN on Windows so that svn commands can be run from the command line. With the default settings only the graphical user interface (GUI) is available, so you may have been surprised that the system did not recognize the svn command.
...==================
SVN How to create a new branch and clone the repository - 1 simple solution?
In this short post I will show you how to create a new branch in an SVN repository and how to make a copy of the main repository in your local branch.
...==================
Talend Studio How to change the language - 5 quick steps?
In this short post I will show how to change the language in Talend Studio, using TOS for Data Integration as the example.
...==================
[SOLVED] Teradata Error 3807 SQLState 42s02 Object 'XYZ' does not exists
In Teradata we quite often come across the error [ Teradata Error 3807 SQLState 42s02 Object 'XYZ' does not exists ], where 'XYZ' is the name of the object we used in the query. The cause of this error can be diagnosed very quickly. In this short post I will discuss the cause of the error and how to resolve it.
...==================
Teradata Primary index - unique, non-unique, or none at all - check the 3 index types?
In this post you will learn what the Teradata primary index (PI) of a table is, why it is worth defining, and which types exist. You will also read about a very important property of Teradata: if you do not define a primary index, that does not mean there isn't one!
...==================
Teradata multiset vs set table - where is the difference - check in 5 minutes?
In this short post you will learn the difference between SET and MULTISET tables in Teradata and why you need to know it before creating your table.
...==================
Teradata Error 3653: In this post I will explain why you ran into Teradata Error 3653 SQLState 21S02 All select-lists do not contain the same number of expressions, also known as Failed [3653 : 21S02] All select-lists do not contain the same number of expressions. I will present the cause of the problem and show how to avoid the error in the future.
...==================
Installing Teradata Express on VMware Workstation Player on Windows - do it yourself in 5 minutes!
In this post I will show how to install Teradata Express version 16.20 on VMware Workstation 15.5.0 Player on the Windows operating system.
...==================
How to run a MySQL database with Docker-Compose in 3 minutes?
In this post I will show you how to run a MySQL database with Docker-Compose. A great deal could be written about Docker and Docker-Compose, since they offer almost unlimited possibilities. In this post, however, we will focus only on showing how to start a MySQL database in a container very quickly.
...==================
GitHub How to configure an SSH connection - Ubuntu 18.04 - see how simple it is!
In this short post I will show you how to configure an SSH connection to GitHub. We will focus on setting up communication between your computer and the GitHub server over the SSH protocol. In other words, we will add an SSH key to your GitHub account!
...==================
Football, machine learning, predicting match results: I have been meaning to write this post for quite a while. One day, when I was playing with the capabilities of the Apache Spark MLlib library, an idea came to my mind…
...==================
Migrating data between two different database systems always involves converting data types. In this post I will introduce the concept of data migration and show how to map Microsoft SQL Server data types to Teradata data types.
...==================
Bash Input arguments and functions with parameters - check 2 simple examples!
Bash is essentially a modern layer on top of sh. As we can read on the official site, Bash is described as offering functional improvements over sh for both programming and interactive use. In addition, most sh scripts can be run by Bash without modification.
...==================
IntelliJ IDEA gitignore does not ignore the .idea folder - fix it in 2 minutes!
By its nature, .gitignore only ignores newly added files. If a directory such as the aforementioned .idea has already been added to your repository, changes to that directory and its files are simply tracked by Git. To solve the problem, you have to remove the path from the Git cache. The same applies to any other directory or file.
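As a rough sketch of the fix described above (not taken from the original post): the usual way to drop a path from the Git index while keeping it on disk is git rm -r --cached. The helper names untrack_command and untrack are hypothetical, and the sketch assumes git is installed and that .idea is the path to untrack.

```python
import subprocess


def untrack_command(path):
    """Build the git command that removes `path` from the index (cache) only.

    The files stay on disk; after the next commit, .gitignore takes effect for them.
    """
    return ["git", "rm", "-r", "--cached", path]


def untrack(path, repo_dir="."):
    """Run the command inside repo_dir (hypothetical helper; requires git)."""
    subprocess.run(untrack_command(path), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Only print the command here; call untrack(".idea") inside a real repository.
    print(" ".join(untrack_command(".idea")))
```

After running the command, the removal still has to be committed so that the path stops being tracked for everyone who uses the repository.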
...==================
How to compare two files in Notepad++ >= v7.8.1 - check out a great method!
In this post I will show how to compare two files in Notepad++. In recent versions of the tool (e.g. v7.8.1), installing the Compare plugin is extremely simple and quick.
...==================
How to install Microsoft Teams on Ubuntu 18.04 (Linux)? 5 min is enough!
In this post I will show you how to install Microsoft Teams on Ubuntu 18.04. Microsoft Teams has recently been gaining more and more popularity among users. No wonder, really, since it is currently hard to find a better tool on the market for communication between people within an organization or for private use.
...==================
How to install Microsoft Teams on Windows 7, Windows 10 or Windows 11?
In this post I will show you how to install Microsoft Teams on Windows! Microsoft Teams has recently been gaining more and more popularity among users. No wonder, really, since it is currently hard to find a better tool on the market for communication with other people within an organization or for private use.
...==================
Windows: How to add a server name and IP to the hosts file? - check in 2 minutes!
A short tutorial on how to add a server name and IP address to the hosts file on Windows, in other words how to edit the hosts file!
...==================
We will clean up all Docker images marked as <none>. When we frequently rebuild an application image that carries the same tag, e.g. latest, we run into the problem that our image registry ends up holding many images marked as <none>.
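One possible way to do this cleanup, sketched here as an assumption rather than as the post's own method: list the IDs of dangling images with docker images --filter dangling=true --quiet and pass them to docker rmi. The helper names rmi_command and remove_dangling_images are hypothetical, and the second one needs a running Docker daemon.

```python
import subprocess

# Lists only the IDs of dangling images, i.e. the ones shown as <none>.
LIST_DANGLING = ["docker", "images", "--filter", "dangling=true", "--quiet"]


def rmi_command(listing_output):
    """Turn the ID listing into a `docker rmi` command.

    Returns an empty list when there is nothing to remove.
    """
    image_ids = listing_output.split()
    return ["docker", "rmi", *image_ids] if image_ids else []


def remove_dangling_images():
    """List and remove dangling images (hypothetical helper; needs Docker)."""
    listing = subprocess.run(LIST_DANGLING, capture_output=True, text=True, check=True)
    command = rmi_command(listing.stdout)
    if command:
        subprocess.run(command, check=True)
    return command
```

Splitting the command construction from the execution keeps the first helper easy to check without Docker; docker image prune is a built-in alternative that removes dangling images in one step.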
...==================
How to create a Hortonworks Sandbox Data Platform environment in the Microsoft Azure cloud? - Part 2
Hello! In the previous tutorial we created a Hortonworks Sandbox virtual machine on the Azure platform. In this tutorial I will show how to connect to that machine and how to use the Hortonworks platform.
...==================
Ubuntu Linux: Brother DCP-J562DW scanner not working - a great solution in 3 min!
If you have ever had a problem with a scanner that does not work, this post is for you. It really annoyed me that every time I had to scan something the same problem appeared: the scanning program, whether Simple Scan or Gscan2pdf, did not detect my scanner. After a few hours of searching the internet for a possible cause, I managed to find a solution, which I present below.
...==================
WordPress Enlighter Error MooTools Framework not loaded yet! - Solve the problem in 2 minutes!
[ Enlighter Error MooTools Framework not loaded yet ] You have probably installed a JS and CSS cache optimizer for your site. In my case the problem appeared after installing the Autoptimize plugin:
...==================
We use Hyper-V virtualization (Docker for Windows) and create a virtual machine, but it has no internet access? No problem. We can solve this easily!
...==================
1. Create a new resource on the Azure platform using the existing Hortonworks Sandbox Data Platform image
In this tutorial I will show you how to create a Hortonworks Sandbox Data Platform environment in the Microsoft Azure cloud. Log in to your Azure portal and click the Create a resource button. In the search field type Hortonworks (that is enough to find the image we are looking for) and click the Create button.
...