Merge pull request #115 from ironmussa/1.0.1
Fix testing for airbrake (and everything else for 1.0.1). Closes #117.
FavioVazquez committed Sep 18, 2017
2 parents fb0ce15 + c87a70d commit 7a11ac1
Showing 16 changed files with 524 additions and 336 deletions.
6 changes: 5 additions & 1 deletion .coveragerc
@@ -1,4 +1,8 @@
 [run]
 omit =
 optimus/*
-setup.py
+setup.py
+[report]
+exclude_lines =
+pragma: no cover
+except RuntimeError as err:
2 changes: 1 addition & 1 deletion .travis.yml
@@ -8,7 +8,7 @@ env:
- SPARK_HOME=/tmp/spark-2.2.0-bin-hadoop2.7
- secure: "sQ3+ELlqK6QdAyH1J4HmtxtKKA4Bk4Hcu+Hmo3Tslynh0j55+CV2j73jYwch24yGHKheoSA8RVdoP52MoguWo+2ciKIZiumShRNKx1zsYWoJFSb0JlGHogauWNuKEPEzHz8nrw3Z8btyNVFGC25WxXJ490lZ43QnMQ2TpGWNtx7Un/LuZhFFEDJQcl4X5SClGbv8beImKGRHh/DNpjzaw9/EZhY+xT3p820dvm2jdDlKtz8IOE2H3XeAM7raQplio1l5zfSjDIhOy+ESo3uGol4PHjhVe2oIpPnBf0BrYIDYff4L9/TUmgWa875hy8CX4E1T6w89YHFVTI4gRa9YEVpgFRe8AW4XI3CdCTmYMcyf1R2xAW7xOXCBHJegx/kLO2yT3BBScjZc7QhRnXk4587P8t3ciOsPEU0U1LzKgftyvc+aJzBFal3IRaRLI1Ji03PY/JESmjKMz3eVT9bLb8aHdaGF+qH9AC8v1Zcek34ip98DP0qww1QK4AQ/F0IC4TFe6KAs1df8ub8x7hWd7rPEVBThKPatPyQ31oXYHlkuPbXI7/6QWylPBtRwX51Wp9QzgDevCOoIR7JnxYQynAf45TZmlAISBcwo1N2KoxaXOAtytxWFzcaXAoYTwvXTlR0CkhXeusyoabQrCYsmL2at0ZCfmHVVHc69IeGSYxQ="
- secure: "rbzcB7761B5wFTsWRQb7lQYKIift2X6XlLBgDCd0iQII+PO8stP8px4l0rb/ELfVdqBv9s4aoAinXlLdp99hjhrrtYhbuk5imc4oNy0WpbMAEqccB47zUhy8efCAd9VGNLj7LX+TBef0DShpRVcFS62m5Oa0Z835vmKQhXsRf8ILiczEbfD7FKx3bLc2WtMOz8A1hZBrnXST5q9PAvvpmRuEC5SPxbAvmvlqbGGJ08c0HnfeeKRcvPwpMq/MuL8H0HU8EKpOE95QOV5Qe7su/YQYcQTyMFWINJ5SrWN5HlA2LUxeOE6ijDllxdX8JKKo3e8cZdbJFQhff1h0jl/m+xhMoAFbnxfyzUjGtFg6uAiuTtLw4FvTBFKXMME2LRUothEhMTh6KknZZDlqxFyoAXfWwCN7dsjY/nT7TJa/BZh+024e5T6wTm8srRGMpVgu+r0gjB9dG2s0lMIzOj6K77PIxJDx45m5Cw7HwZPZwe1HIWyOoct7Er3pNKGFvt9AXLpj85sp6FWzzJC4cGJ0SV5UhWI75NujpVO8jKn1gOuYBdSrP6y9ytJiQIvh4YDSpF19v7qsYeRIg64yH5x16OnY03aUxEQytASOrAng6BeUVanJPa6EwSJRfNl0xWcmaj+MkyQPqapnKuiiqnLlJeDTH8pnr/GUMpSDeFD6sQg="
- secure: "kZwnCS0XIDrg0Xre/LqyeDMoJMAMy00KW/rMtslTgFAknjylAUGOY8cCVlEFcOM+lmeCT3cDaiV+J+j5+rYk4LBKhU/n1h4G+Wj6KQdamTPysmSj6CDV0UlMh2vWIA/2qthB+wfOkA7fMl/YFNEc3wrmYKIQkPboSfEbBIyz3K8tZHSn0tKJ5iZlw269OSJdecRJ0lqPBXuY5IMw1wW9IroL2toH+bg55wFR9H0mYo46fBWfLjY1fHwKAtMI2epFxrIlhYejd1fRWyQY4qC8B29P5MgjuCya7OGVYl92M5Ldpl1p91+f2IIbpljxepBha7KKTSSONbOvGWuW9vGR2x86zN2yEBwhKEHZ/Yjoz9kuE9e910I/2YAH+dhwN+ZZKj457dPc4HMtXkXucPyppJoy9jwQIz4ZQbBOxrHgd+YV4gPm9x3P5Jjdic4d7LoYVzVRnGjbsxK1Y8H/oxzN54Jtoge6Oq8ZorDu+poALVhN/qtoXDtb+RGliAqIS9oUceeKMYDxmSPYToNBXAREGnEy5dO+vWUCrfmBOcUDRnUndfzWyx+xfOKiqo2lzz4qNihdwCad6nPzi+3bYUzHOFDrGHJYkQsF1Vj8TGbPXpuKcyRvQsgIn0JetHRifEOjArMM7xkvXYGLjdu7y1PrAj2kcoHPkkz7CpcQgBNvDkc="
- AIRBRAKE_ENVIRONMENT=development

before_install:
- chmod +x install-spark-2-2.sh
19 changes: 17 additions & 2 deletions README.md
@@ -2,10 +2,11 @@

 [![PyPI version](https://badge.fury.io/py/optimuspyspark.svg)](https://badge.fury.io/py/optimuspyspark) [![Build Status](https://travis-ci.org/ironmussa/Optimus.svg?branch=master)](https://travis-ci.org/ironmussa/Optimus) [![Documentation Status](https://readthedocs.org/projects/optimus-ironmussa/badge/?version=latest)](http://optimus-ironmussa.readthedocs.io/en/latest/?badge=latest)
 [![built_by iron](https://img.shields.io/badge/built_by-iron-FF69A4.svg)](http://ironmussa.com) [![Updates](https://pyup.io/repos/github/ironmussa/Optimus/shield.svg)](https://pyup.io/repos/github/ironmussa/Optimus/)
-[![GitHub release](https://img.shields.io/github/release/ironmussa/optimus.svg)](https://github.com/ironmussa/Optimus/) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/e01572e2af5640fcbcdd58e7408f3ea0)](https://www.codacy.com/app/favio.vazquezp/Optimus?utm_source=github.com&utm_medium=referral&utm_content=ironmussa/Optimus&utm_campaign=badger)
+[![GitHub release](https://img.shields.io/github/release/ironmussa/optimus.svg)](https://github.com/ironmussa/Optimus/) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/e01572e2af5640fcbcdd58e7408f3ea0)](https://www.codacy.com/app/favio.vazquezp/Optimus?utm_source=github.com&utm_medium=referral&utm_content=ironmussa/Optimus&utm_campaign=badger) [![StackShare](https://img.shields.io/badge/tech-stack-0690fa.svg?style=flat)](https://stackshare.io/iron-mussa/devops)

-[![Platforms](https://img.shields.io/badge/platform-Linux%20%7C%20Mac%20OS%20%7C%20Windows-blue.svg)](https://spark.apache.org/docs/2.2.0/#downloading) [![Dependency Status](https://gemnasium.com/badges/github.com/ironmussa/Optimus.svg)](https://gemnasium.com/github.com/ironmussa/Optimus) [![Quality Gate](https://sonarqube.com/api/badges/gate?key=ironmussa-optimus:optimus)](https://sonarqube.com/dashboard/index/ironmussa-optimus:optimus)
+[![Platforms](https://img.shields.io/badge/platform-Linux%20%7C%20Mac%20OS%20%7C%20Windows-blue.svg)](https://spark.apache.org/docs/2.2.0/#downloading) [![Dependency Status](https://gemnasium.com/badges/github.com/ironmussa/Optimus.svg)](https://gemnasium.com/github.com/ironmussa/Optimus) [![Quality Gate](https://sonarqube.com/api/badges/gate?key=ironmussa-optimus:optimus)](https://sonarqube.com/dashboard/index/ironmussa-optimus:optimus)

 [![Join the chat at https://gitter.im/optimuspyspark/Lobby](https://badges.gitter.im/optimuspyspark/Lobby.svg)](https://gitter.im/optimuspyspark/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

 Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion. It uses all the power of
 Apache Spark (optimized via Catalyst) to do so. It implements several handy tools for data wrangling and munging that will
@@ -52,6 +53,20 @@ In your terminal just type:
 pip install optimuspyspark
 ```

+## Contributing to Optimus (based on Webpack)
+
+Contributions go far beyond pull requests and commits. We are very happy to receive any kind of contributions
+including:
+
+* [Documentation](https://github.com/ironmussa/Optimus/blob/master/docs/index.rst) updates, enhancements, designs, or
+bugfixes.
+* Spelling or grammar fixes.
+* README.md corrections or redesigns.
+* Adding unit, or functional [tests](https://github.com/ironmussa/Optimus/tree/master/tests)
+* Triaging GitHub issues -- especially determining whether an issue still persists or is reproducible.
+* [Searching #optimusdata on twitter](https://twitter.com/search?q=optimusdata) and helping someone else who needs help.
+* Helping others in our optimus [gitter channel](https://gitter.im/optimuspyspark/Lobby).
+
 ## Contributors:

 - Project Manager: [Argenis León](https://github.com/argenisleon)
66 changes: 33 additions & 33 deletions docs/index.rst
@@ -432,7 +432,7 @@ Lets say we want to plot a histogram of frecuencies for the ``product`` column.

 .. code:: python
-productDf = analyzer.get_data_frame().select("product") #or df.select("product")
+productDf = analyzer.get_data_frame.select("product") #or df.select("product")
 hist_dictPro = analyzer.get_categorical_hist(df_one_col=productDf, num_bars=10)
 print(hist_dictPro)
@@ -462,7 +462,7 @@ Lets say we want to plot a histogram of frecuencies for the ``price`` column. We

 .. code:: python
-priceDf = analyzer.get_data_frame().select("price") #or df.select("price")
+priceDf = analyzer.get_data_frame.select("price") #or df.select("price")
 hist_dictPri = analyzer.get_numerical_hist(df_one_col=priceDf, num_bars=10)
 print(hist_dictPri)
@@ -624,7 +624,7 @@ dataFrame:
 # DataFrameTransformer Instanciation:
 transformer = op.DataFrameTransformer(df)
-transformer.get_data_frame().show()
+transformer.show()
 Output:

@@ -663,14 +663,14 @@ operation in whole dataframe.
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Triming string blank spaces:
 transformer.trim_col("*")
 # Printing trimmed dataFrame:
 print('Trimmed dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -717,14 +717,14 @@ names.
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # drop column specified:
 transformer.drop_col("country")
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:
@@ -772,14 +772,14 @@ argument in DataFrame.
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Keep columns specified by user:
 transformer.keep_col(['city', 'population'])
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -832,14 +832,14 @@ in all columns of DataFrame that have same dataType of ``search`` and
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Replace values in columns specified by user:
 transformer.replace_col(search='Tokyo', changeTo='Maracaibo', columns='city')
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -893,15 +893,15 @@ python feature.
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Replace values in columns specified by user:
 func = lambda pop: (pop > 6500000) & (pop <= 30000000)
 transformer.delete_row(func(col('population')))
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -940,7 +940,7 @@ New dataFrame:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Delect rows where Tokyo isn't found in city
 # column or France isn't found in country column:
@@ -949,7 +949,7 @@ New dataFrame:
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1006,7 +1006,7 @@ Here some examples:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 print (' Replacing a number if value in cell is greater than 5:')
@@ -1016,7 +1016,7 @@ Here some examples:
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1057,15 +1057,15 @@ New dataFrame:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Capital letters:
 func = lambda cell: cell.upper()
 transformer.set_col(['city'], func, 'string')
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1151,14 +1151,14 @@ New DF:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Clear accents:
 transformer.clear_accents(columns='*')
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1207,14 +1207,14 @@ E.g:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Remove special characters:
 transformer.remove_special_chars(columns=['city', 'country'])
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1259,15 +1259,15 @@ E.g:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 names = [('city', 'villes')]
 # Changing name of columns:
 transformer.rename_col(names)
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1354,14 +1354,14 @@ New DF:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Capital letters:
 transformer.lookup('city', ['Caracas', 'Ccs'], 'Caracas')
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1411,14 +1411,14 @@ E.g:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Capital letters:
 transformer.move_col('city', 'country', position='after')
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1510,14 +1510,14 @@ New DF:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Transformation:
 transformer.explode_table('bill id', 'foods', 'Beer')
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 Original dataFrame:

@@ -1616,7 +1616,7 @@ New DF:
 # Printing of original dataFrame:
 print('Original dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
 # Tranform string date format:
 transformer.date_transform(columns="dates",
@@ -1625,7 +1625,7 @@ New DF:
 # Printing new dataFrame:
 print('New dataFrame:')
-transformer.get_data_frame().show()
+transformer.show()
Original dataFrame:
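
Note on the docs/index.rst changes: the examples replace `transformer.get_data_frame().show()` with `transformer.show()`, and call `analyzer.get_data_frame` without parentheses. The library code that backs this is not visible in the hunks shown here, so the following is only a minimal sketch of one way such an interface could behave, assuming a property for `get_data_frame` and a delegating `show()` method. `FakeDataFrame`, `SketchAnalyzer` and `SketchTransformer` are hypothetical stand-ins, not Optimus or Spark classes.

```python
class FakeDataFrame:
    """Hypothetical stand-in for a Spark DataFrame so the sketch runs without Spark."""

    def __init__(self, rows):
        self.rows = rows

    def select(self, *cols):
        # Keep only the requested columns in every row.
        return FakeDataFrame([{c: row[c] for c in cols} for row in self.rows])

    def show(self):
        for row in self.rows:
            print(row)


class SketchAnalyzer:
    """Hypothetical analyzer that exposes its DataFrame as a property (assumption)."""

    def __init__(self, df):
        self._df = df

    @property
    def get_data_frame(self):
        # Read as an attribute: analyzer.get_data_frame (no parentheses).
        return self._df


class SketchTransformer:
    """Hypothetical transformer with a show() shortcut (assumption)."""

    def __init__(self, df):
        self._df = df

    def show(self):
        # Delegates to the wrapped DataFrame instead of exposing it first.
        self._df.show()


df = FakeDataFrame([{"product": "pizza", "price": 8}, {"product": "taco", "price": 3}])
product_df = SketchAnalyzer(df).get_data_frame.select("product")
SketchTransformer(product_df).show()
```

Under this assumption, code still written as `analyzer.get_data_frame()` would fail with a `TypeError`, because the property already returns the DataFrame object and that object is not callable.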
