- Create a `DataFrame` named `df` with 6 rows and the following columns:
    - `A`: random floating point values
    - `B`: randomly assigned categorical values from `["test", "train"]`
    - `C`: random integer values, constructed from a `numpy.array`
    - `D`: random integer values, constructed from a `Series`
    - `E`: monthly dates "2021-01-01", "2021-02-01", "2021-03-01", ...
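One way to build such a frame (the seed and the value ranges are arbitrary choices, fixed here only so the result is reproducible):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

df = pd.DataFrame(
    {
        "A": rng.random(6),                                         # random floats
        "B": pd.Categorical(rng.choice(["test", "train"], 6)),      # categorical labels
        "C": np.array(rng.integers(0, 10, size=6), dtype="int32"),  # from a numpy.array
        "D": pd.Series(rng.integers(0, 10, size=6), dtype="int32"), # from a Series
        "E": pd.date_range("2021-01-01", periods=6, freq="MS"),     # month-start dates
    }
)
print(df)
```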
- Convert the numeric columns into a `numpy.matrix` and compute the row sums.
- Sort `df` by column `C`.
- Filter `df` for entries for which `B` has the value `train` and `C` has values greater than 0.
- Change the value in the 4th column and 2nd row to 10.
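A sketch of these four steps, assuming the `df` from the first exercise (rebuilt here in minimal form so the snippet runs standalone):

```python
import numpy as np
import pandas as pd

# Minimal stand-in for the df built in the first exercise.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "A": rng.random(6),
        "B": pd.Categorical(["test", "train"] * 3),
        "C": np.array(rng.integers(-5, 5, size=6), dtype="int32"),
        "D": pd.Series(rng.integers(-5, 5, size=6), dtype="int32"),
    }
)

# Numeric columns -> numpy.matrix, then row sums along axis 1.
# (numpy.matrix is discouraged in modern NumPy; a plain ndarray
# with .sum(axis=1) gives the same numbers.)
m = np.asmatrix(df.select_dtypes("number").to_numpy())
row_sums = m.sum(axis=1)

# Sort by column C.
df_sorted = df.sort_values("C")

# Filter: B == "train" and C > 0.
filtered = df[(df["B"] == "train") & (df["C"] > 0)]

# 4th column, 2nd row (1-based) -> positional index [1, 3].
df.iloc[1, 3] = 10
```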
- Create a column `F` where half the values are `NaN`.
- Deal with missing values in two different ways:
    - remove entries with missing data
    - fill missing values with 0
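A sketch of both strategies, again on a minimal stand-in `df`; placing the `NaN`s at every other position is an arbitrary choice:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(6, dtype=float), "B": ["test", "train"] * 3})

# Column F: half the values are NaN (every other entry here).
df["F"] = [1.0, np.nan, 3.0, np.nan, 5.0, np.nan]

dropped = df.dropna()   # 1) remove entries (rows) with missing data
filled = df.fillna(0)   # 2) fill missing values with 0
```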
- Convert column `A` into a cumulative sum.
- Subtract column `A` from column `B`.
- Plot the numeric columns as a line plot, ensuring that the plot has proper labels.
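A sketch of the cumulative sum and the labelled plot (note that subtracting `A` from `B` only works if `B` holds numeric values; the non-interactive Agg backend is used here so the script also runs headless):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this when working interactively
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.random.default_rng(0).random(6),
                   "C": np.arange(6)})

df["A"] = df["A"].cumsum()  # replace A with its cumulative sum

ax = df.plot(title="Numeric columns")  # line plot of the numeric columns
ax.set_xlabel("row index")
ax.set_ylabel("value")
ax.figure.savefig("columns.png")
```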
- Compute the mean values of each column for the groups `train` and `test`.
- Convert the following `DataFrame` from `a` into `b` (long to wide). Additionally, convert from `b` into `a` (wide to long).

```python
a = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5, 6], "group": ["a", "a", "a", "b", "b", "b"]}
)
b = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6]}
)
```
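A sketch of the group means and of both reshapes; the `row` helper column is introduced here only to give `pivot` an index to align on:

```python
import pandas as pd

# Group means on a minimal df with a train/test column B.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0],
                   "B": ["train", "test", "train", "test"]})
means = df.groupby("B").mean(numeric_only=True)

a = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5, 6], "group": ["a", "a", "a", "b", "b", "b"]}
)

# long -> wide: number the rows within each group, then pivot on that counter.
b = (
    a.assign(row=a.groupby("group").cumcount())
     .pivot(index="row", columns="group", values="value")
)

# wide -> long.
a2 = b.melt(var_name="group", value_name="value")
```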
- Load the `iris` dataset by

```python
import sklearn as sk
import sklearn.datasets

iris = sk.datasets.load_iris()
```
- Visualize the data matrix.
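One simple option is a heatmap of the 150×4 data matrix with the feature names on the x-axis; this is just one of several reasonable visualizations:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this when working interactively
import matplotlib.pyplot as plt
import sklearn.datasets

iris = sklearn.datasets.load_iris()

fig, ax = plt.subplots(figsize=(4, 8))
im = ax.imshow(iris.data, aspect="auto")          # one row per sample
ax.set_xticks(range(len(iris.feature_names)))
ax.set_xticklabels(iris.feature_names, rotation=45, ha="right")
ax.set_ylabel("sample")
fig.colorbar(im, ax=ax, label="cm")
fig.tight_layout()
fig.savefig("iris_matrix.png")
```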
- Train a random forest classifier to predict the target values and report its performance using an appropriate evaluation metric.
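A sketch using a held-out test split and accuracy as the metric (accuracy is a reasonable choice here because the iris classes are balanced; the split fraction and seeds are arbitrary):

```python
import sklearn.datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = sklearn.datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

acc = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {acc:.3f}")
```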
- Explain how key parameters of the random forest classifier would influence its performance.
- Using `Biopython`, collect MEDLINE abstracts on "medulloblastoma" published in 2012. Save the data to disk as a CSV table.
- Import the CSV table and build a SQLite database.
- Obtain the PMID and title of publications for authors with the surname "Shih" from the database.
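The Entrez step needs network access and an e-mail address registered with NCBI, so it is only sketched in a comment (`Bio.Entrez.esearch`/`efetch` against `db="pubmed"` are the relevant Biopython calls). The CSV and SQLite parts below run offline on a small stand-in table whose PMIDs and titles are illustrative, not real records:

```python
import csv
import sqlite3

# Real data would come from Biopython, roughly:
#   from Bio import Entrez
#   Entrez.email = "you@example.org"   # required by NCBI
#   handle = Entrez.esearch(db="pubmed",
#                           term="medulloblastoma AND 2012[pdat]")
# followed by Entrez.efetch on the returned IDs.
rows = [
    ("10000001", "Example title one", "Shih"),   # illustrative stand-in records,
    ("10000002", "Example title two", "Doe"),    # not real publications
]

# Save to disk as a CSV table.
with open("abstracts.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["pmid", "title", "surname"])
    writer.writerows(rows)

# Import the CSV and build a SQLite database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE abstracts (pmid TEXT, title TEXT, surname TEXT)")
with open("abstracts.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    con.executemany("INSERT INTO abstracts VALUES (?, ?, ?)", reader)

# Query PMID and title for the surname "Shih".
hits = con.execute(
    "SELECT pmid, title FROM abstracts WHERE surname = ?", ("Shih",)
).fetchall()
```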
- Implement a fully connected feedforward network from scratch using only the `numpy` library with the following layers: one input, two hidden, and one output. Neurons in the first hidden layer should use the sigmoid transfer function; those in the second hidden layer should use a ReLU transfer function. The network should be trained using backpropagation of errors.
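A minimal sketch of such a network, trained here on the XOR toy problem with a squared-error loss; the layer widths, learning rate, epoch count, and seed are all arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR toy problem: 2 inputs, 1 output.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights: input(2) -> hidden1(8, sigmoid) -> hidden2(8, ReLU) -> output(1).
W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 0.5, (8, 8)), np.zeros(8)
W3, b3 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)

lr, losses = 0.5, []
for epoch in range(5000):
    # Forward pass.
    a1 = sigmoid(X @ W1 + b1)       # hidden layer 1: sigmoid
    z2 = a1 @ W2 + b2
    a2 = np.maximum(z2, 0.0)        # hidden layer 2: ReLU
    out = sigmoid(a2 @ W3 + b3)     # output layer

    losses.append(float(np.mean((out - y) ** 2)))

    # Backward pass: propagate the error layer by layer.
    d3 = (out - y) * out * (1 - out)   # output delta (sigmoid derivative)
    d2 = (d3 @ W3.T) * (z2 > 0)        # ReLU derivative is 1 where z2 > 0
    d1 = (d2 @ W2.T) * a1 * (1 - a1)   # sigmoid derivative

    # Gradient-descent updates, averaged over the batch.
    n = len(X)
    W3 -= lr * (a2.T @ d3) / n; b3 -= lr * d3.mean(axis=0)
    W2 -= lr * (a1.T @ d2) / n; b2 -= lr * d2.mean(axis=0)
    W1 -= lr * (X.T @ d1) / n;  b1 -= lr * d1.mean(axis=0)

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```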