rbrtjwrk/data_wrangling

data wrangling

data_for_plot

The data_for_plot(dataframe, group_by, select) function takes as input a dataframe in the following format:

| Year | AIS | SJR | B | C |
|------|-------|-------|------|-----|
| 2015 | 15044 | 29876 | 1708 | 200 |
| 2016 | 16234 | 31051 | 1200 | 300 |
| 2017 | 18001 | 35015 | 998 | 777 |

Based on the specified parameters, it returns a reorganized dataframe. For example, data_for_plot(dataframe, "Year", ["B", "C"]) returns:

| Year | Values | Type |
|------|--------|------|
| 2015 | 1708 | B |
| 2015 | 200 | C |
| 2016 | 1200 | B |
| 2016 | 300 | C |
| 2017 | 998 | B |
| 2017 | 777 | C |

This layout, with one value per row, is convenient for visualizing the data with ggplot2.
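The reshaping described above can be sketched with pandas. This is only an illustration under assumptions, not the repository's implementation: the name data_for_plot_sketch is hypothetical, and it is assumed that the input is a pandas DataFrame and that the output columns are named Values and Type as in the example table.

```python
import pandas as pd


def data_for_plot_sketch(dataframe, group_by, select):
    """Reshape the selected wide columns into (group_by, Values, Type) rows."""
    long_df = dataframe.melt(
        id_vars=group_by,      # column kept as the grouping key, e.g. "Year"
        value_vars=select,     # columns to stack, e.g. ["B", "C"]
        var_name="Type",       # holds the original column name
        value_name="Values",   # holds the cell value
    )
    # melt stacks the data column by column; a stable sort on the grouping
    # column interleaves the selected columns within each group.
    long_df = long_df.sort_values(group_by, kind="stable").reset_index(drop=True)
    return long_df[[group_by, "Values", "Type"]]


df = pd.DataFrame({
    "Year": [2015, 2016, 2017],
    "AIS": [15044, 16234, 18001],
    "SJR": [29876, 31051, 35015],
    "B": [1708, 1200, 998],
    "C": [200, 300, 777],
})
result = data_for_plot_sketch(df, "Year", ["B", "C"])
print(result)
```

The stable sort is a design choice made here so that the row order matches the example table; if row order does not matter for plotting, the melt call alone would be enough.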

docx_to_txt

The process_docx_to_txt(directory) function converts every .docx file in the given directory to a .txt file.
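As a rough sketch of what such a conversion could look like, the example below reads paragraph text directly from the .docx archive using only the Python standard library. It is not the repository's code: the names docx_text and process_docx_to_txt_sketch are hypothetical, and it is assumed that each .txt file is written next to its source file with the same base name, in UTF-8, without descending into subdirectories. Headers, footers, and formatting are ignored.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(path):
    """Return the body text of one .docx file, one paragraph per line."""
    # A .docx file is a zip archive; the main text lives in word/document.xml.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # Each paragraph is split into runs; join the text of all runs.
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def process_docx_to_txt_sketch(directory):
    """Write a .txt file next to every .docx file found in `directory`."""
    written = []
    for docx_path in sorted(Path(directory).glob("*.docx")):
        txt_path = docx_path.with_suffix(".txt")
        txt_path.write_text(docx_text(docx_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling process_docx_to_txt_sketch("some_directory") returns the list of .txt paths it wrote, which makes it easy to check what was converted.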

onehot

The onehot(dataframe, labels_colname) function one-hot encodes a dataframe that contains a column with exactly one label per row. It returns only the one-hot encoded dataframe, without any data from the original one. For example, the first table below is encoded as the second:

| book | label |
|------|-------|
| Normal People | novel |
| Outline | novel |
| Inventing the Future | politics |

| novel | politics |
|-------|----------|
| 1 | 0 |
| 1 | 0 |
| 0 | 1 |
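The encoding shown in the tables above can be approximated with pandas. This is a minimal sketch under assumptions, not the repository's implementation: the name onehot_sketch is hypothetical, and it is assumed that the input is a pandas DataFrame and that the indicator columns should hold 0/1 integers.

```python
import pandas as pd


def onehot_sketch(dataframe, labels_colname):
    """Return only the one-hot columns derived from `labels_colname`."""
    # get_dummies creates one indicator column per distinct label;
    # dtype=int yields 0/1 integers instead of booleans.
    return pd.get_dummies(dataframe[labels_colname], dtype=int)


books = pd.DataFrame({
    "book": ["Normal People", "Outline", "Inventing the Future"],
    "label": ["novel", "novel", "politics"],
})
encoded = onehot_sketch(books, "label")
print(encoded)
```

Because only the label column is passed to get_dummies, the book column is dropped from the result, which matches the described behavior of returning no data from the original dataframe.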

About

A set of useful functions for working with data.
