
Update Featuretools Branch Name #1038


Merged
merged 10 commits into main on Jul 2, 2020

Conversation

@gsheni (Contributor) commented Jun 29, 2020

Pull Request Description

  • Change default branch name from master to main

@gsheni requested a review from rwedge on June 29, 2020 18:23
@gsheni self-assigned this on Jun 29, 2020
release.md Outdated
@@ -28,12 +28,12 @@ Branches on the conda-forge featuretools repo are automatically built and the pa
```bash
git remote add upstream https://github.com/conda-forge/featuretools-feedstock.git
```
4. If you made the fork previously and its master branch is missing commits, update it with any changes from upstream
Contributor

Unsure if we will be able to rename this branch since it is part of conda-forge, worth looking into

@gsheni (Contributor Author) commented Jun 30, 2020

It seems that conda-forge defaults to master, and all the packages use master.

codecov bot commented Jun 30, 2020

Codecov Report

Merging #1038 into main will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #1038      +/-   ##
==========================================
- Coverage   98.35%   98.33%   -0.02%     
==========================================
  Files         126      126              
  Lines       13052    13052              
==========================================
- Hits        12837    12835       -2     
- Misses        215      217       +2     
| Impacted Files | Coverage Δ |
|---|---|
| featuretools/synthesis/deep_feature_synthesis.py | 97.34% <0.00%> (-0.49%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f779e3c...20b2e4e.

@@ -88,7 +88,7 @@ Featuretools contains many [different types of built-in primitives](https://prim
## Demos
**Predict Next Purchase**

-[Repository](https://github.com/Featuretools/predict_next_purchase/) | [Notebook](https://github.com/Featuretools/predict_next_purchase/blob/master/Tutorial.ipynb)
+[Repository](https://github.com/Featuretools/predict_next_purchase/) | [Notebook](https://github.com/Featuretools/predict_next_purchase/blob/main/Tutorial.ipynb)

@@ -113,9 +113,9 @@ When an entire dataset is not required to calculate the features for a given set

An example of this approach can be seen in the `Predict Next Purchase demo notebook <https://github.com/featuretools/predict_next_purchase>`_. In this example, we partition data by customer and only load a fixed number of customers into memory at any given time. We implement this easily using `Dask <https://dask.pydata.org/>`_, which could also be used to scale the computation to a cluster of computers. A framework like `Spark <https://spark.apache.org/>`_ could be used similarly.

-An additional example of partitioning data to distribute on multiple cores or a cluster using Dask can be seen in the `Featuretools on Dask notebook <https://github.com/Featuretools/Automated-Manual-Comparison/blob/master/Loan%20Repayment/notebooks/Featuretools%20on%20Dask.ipynb>`_. This approach is detailed in the `Parallelizing Feature Engineering with Dask article <https://medium.com/feature-labs-engineering/scaling-featuretools-with-dask-ce46f9774c7d>`_ on the Feature Labs engineering blog. Dask allows for simple scaling to multiple cores on a single computer or multiple machines on a cluster.
+An additional example of partitioning data to distribute on multiple cores or a cluster using Dask can be seen in the `Featuretools on Dask notebook <https://github.com/Featuretools/Automated-Manual-Comparison/blob/main/Loan%20Repayment/notebooks/Featuretools%20on%20Dask.ipynb>`_. This approach is detailed in the `Parallelizing Feature Engineering with Dask article <https://medium.com/feature-labs-engineering/scaling-featuretools-with-dask-ce46f9774c7d>`_ on the Feature Labs engineering blog. Dask allows for simple scaling to multiple cores on a single computer or multiple machines on a cluster.


-For a similar partition and distribute implementation using Apache Spark with PySpark, refer to the `Feature Engineering on Spark notebook <https://github.com/Featuretools/predicting-customer-churn/blob/master/churn/4.%20Feature%20Engineering%20on%20Spark.ipynb>`_. This implementation shows how to carry out feature engineering on a cluster of EC2 instances using Spark as the distributed framework. A write-up of this approach is described in the `Featuretools on Spark article <https://blog.featurelabs.com/featuretools-on-spark-2/>`_ on the Feature Labs engineering blog.
+For a similar partition and distribute implementation using Apache Spark with PySpark, refer to the `Feature Engineering on Spark notebook <https://github.com/Featuretools/predicting-customer-churn/blob/main/churn/4.%20Feature%20Engineering%20on%20Spark.ipynb>`_. This implementation shows how to carry out feature engineering on a cluster of EC2 instances using Spark as the distributed framework. A write-up of this approach is described in the `Featuretools on Spark article <https://blog.featurelabs.com/featuretools-on-spark-2/>`_ on the Feature Labs engineering blog.
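
The documentation link changes in this PR all follow one pattern: the `blob/master/` segment of a GitHub file URL becomes `blob/main/`. As a rough illustration only (this is not how the PR was produced, and `retarget_blob_links` is a hypothetical helper name), the rewrite could be sketched in Python as below. It assumes the linked repository really has a branch with the new name, which, as the conda-forge discussion above shows, is not true of every repository.

```python
import re

# Matches the "blob/master/" segment of a GitHub file URL and captures the
# owner/repo prefix so it can be kept unchanged.
_BLOB_MASTER = re.compile(r"(https://github\.com/[^/\s]+/[^/\s]+/blob/)master/")


def retarget_blob_links(text, new_branch="main"):
    """Point GitHub blob links at new_branch instead of master.

    Hypothetical helper for illustration. It does not check that
    new_branch exists in the target repository.
    """
    return _BLOB_MASTER.sub(lambda m: m.group(1) + new_branch + "/", text)


old_line = (
    "[Notebook](https://github.com/Featuretools/predict_next_purchase/"
    "blob/master/Tutorial.ipynb)"
)
print(retarget_blob_links(old_line))
```

Links that do not contain a `blob/master/` segment, such as the plain repository URL, are left untouched by this pattern.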

  - name: PyPI Upload
-   uses: FeatureLabs/gh-action-pypi-upload@master
+   uses: FeatureLabs/gh-action-pypi-upload@main

Contributor

Perhaps we should switch to using tagged releases of this action so that the branch name isn't even a factor
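
To make the suggestion above concrete, here is a minimal Python sketch that lists `uses:` entries in a workflow file whose ref looks like a branch name rather than a release tag. The helper name `branch_pinned_actions` and the rule "a tag looks like `v1` or `v1.2.3`" are assumptions made for this illustration. Whether the `gh-action-pypi-upload` action publishes such tags is not stated in this thread.

```python
import re

# Hypothetical helper, not part of any GitHub tooling: it only inspects text.
_USES = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@]+)@(\S+)", re.MULTILINE)

# Assumption: refs such as "v1" or "v1.2.3" are release tags. Anything else
# (for example "master" or "main") is treated as a branch name. A commit SHA
# would also be flagged by this simple check.
_TAG = re.compile(r"^v?\d+(\.\d+)*$")


def branch_pinned_actions(workflow_text):
    """Return (action, ref) pairs whose ref does not look like a version tag."""
    return [
        (action, ref)
        for action, ref in _USES.findall(workflow_text)
        if not _TAG.match(ref)
    ]


workflow = """
steps:
  - name: PyPI Upload
    uses: FeatureLabs/gh-action-pypi-upload@master
"""
print(branch_pinned_actions(workflow))
```

An entry pinned to a tag-like ref would not be reported, so a later branch rename in the action's repository would not affect the workflow.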

@rwedge changed the base branch from master to main on July 1, 2020 19:50
@gsheni merged commit 45998e8 into main on Jul 2, 2020
@gsheni deleted the update_ft_branch branch on July 2, 2020 18:24
@rwedge mentioned this pull request on Jul 31, 2020