Add matrix factorization example back to pymc3 #3709

Merged

ColCarroll merged 2 commits into pymc-devs:master from zaxtax:pmf_notebook

Dec 6, 2019

Contributor

zaxtax commented Dec 5, 2019

Reintroducing the probabilistic matrix factorization example originally from @macks22, using the MovieLens dataset instead of Jester. This could still use some polish, but the notebook works, albeit slowly.


          Add initial draft of matrix factorization example

a284ef1

review-notebook-app bot commented Dec 5, 2019

Check out this pull request on ReviewNB. You'll be able to see the Jupyter notebook diff and discuss changes. Powered by ReviewNB.

Member

fonnesbeck commented Dec 5, 2019

This is great, and is already pretty polished, actually. Runs in about 30 min on my MBP.

codecov bot commented Dec 5, 2019

Codecov Report

Merging #3709 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3709   +/-   ##
=======================================
  Coverage   89.94%   89.94%           
=======================================
  Files         134      134           
  Lines       20430    20430           
=======================================
  Hits        18375    18375           
  Misses       2055     2055

ColCarroll reviewed

docs/source/notebooks/probabilistic_matrix_factorization.ipynb

    
            @@ -0,0 +1,1489 @@
          
              {

Member

ColCarroll Dec 5, 2019

If you run this again, can you bump to PyMC3 v3.8?

Member

ColCarroll Dec 5, 2019

could be just a single bar plot for this, right? the density plot is a little weird

Member

ColCarroll Dec 5, 2019

I'd use

print(f"Users: {num_users}\nMovies: {num_items}\nSparsity: {sparsity}")
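As a rough, self-contained illustration of where the three values in that f-string could come from: the sketch below assumes ratings are held as a dense user-by-item grid with NaN for missing entries, and that sparsity means the fraction of missing cells. The helper `rating_stats` and the toy data are hypothetical and are not taken from the notebook.

```python
import math


def rating_stats(ratings):
    """Return (num_users, num_items, sparsity) for a user-by-item grid.

    Assumption: missing ratings are NaN, and sparsity is the fraction
    of cells that are missing.
    """
    num_users = len(ratings)
    num_items = len(ratings[0]) if ratings else 0
    total = num_users * num_items
    # Count only the cells that actually hold a rating.
    observed = sum(1 for row in ratings for r in row if not math.isnan(r))
    sparsity = 1 - observed / total if total else 0.0
    return num_users, num_items, sparsity


nan = float("nan")
toy = [
    [5.0, nan, 3.0],
    [nan, nan, 1.0],
]
num_users, num_items, sparsity = rating_stats(toy)
print(f"Users: {num_users}\nMovies: {num_items}\nSparsity: {sparsity:.2%}")
```

The `:.2%` format spec is optional; it only renders the fraction as a percentage.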

Member

ColCarroll Dec 5, 2019

I would replace this with a function:

def make_pmf_model(train, dim, alpha, std):
    bounds = (1, 5)
    data = train.copy()
    n, m = data.shape

    # Perform mean value imputation
    nan_mask = np.isnan(data)
    data[nan_mask] = data[~nan_mask].mean()

    # Low precision reflects uncertainty; prevents overfitting.
    # Set to the mean variance across users and items.
    alpha_u = 1 / data.var(axis=1).mean()
    alpha_v = 1 / data.var(axis=0).mean()

    # Specify the model.
    logging.info('building the PMF model')
    with pm.Model() as pmf:
        U = pm.MvNormal(
            'U', mu=0, tau=alpha_u * np.eye(dim),
            shape=(n, dim), testval=np.random.randn(n, dim) * std)
        V = pm.MvNormal(
            'V', mu=0, tau=alpha_v * np.eye(dim),
            shape=(m, dim), testval=np.random.randn(m, dim) * std)
        R = pm.Normal(
            'R', mu=(U @ V.T)[~nan_mask], tau=alpha, observed=data[~nan_mask])
    return pmf
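To make the preprocessing in that function easier to check in isolation, here is a NumPy-only sketch of the mean value imputation and the two prior precisions, with the PyMC3 model left out. The helper name `impute_and_precisions` and the toy matrix are hypothetical; only the arithmetic mirrors the suggestion above.

```python
import numpy as np


def impute_and_precisions(train):
    """Mean-impute missing ratings and derive the prior precisions.

    Hypothetical helper: mirrors the preprocessing steps of the
    suggested make_pmf_model, without building any model.
    """
    data = np.array(train, dtype=float)  # np.array copies, so train is untouched
    nan_mask = np.isnan(data)
    # Fill missing cells with the mean of the observed ratings.
    data[nan_mask] = data[~nan_mask].mean()
    # Precision = inverse of the mean per-user / per-item variance.
    # Note: this divides by zero if every row (or column) is constant.
    alpha_u = 1 / data.var(axis=1).mean()
    alpha_v = 1 / data.var(axis=0).mean()
    return data, nan_mask, alpha_u, alpha_v


toy = np.array([
    [5.0, np.nan, 3.0],
    [np.nan, 1.0, 1.0],
])
data, nan_mask, alpha_u, alpha_v = impute_and_precisions(toy)
print(data)
print(alpha_u, alpha_v)
```

The mask is returned alongside the filled matrix because the likelihood in the suggested function is evaluated only on the originally observed entries, `data[~nan_mask]`.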

Member

ColCarroll Dec 5, 2019

I'd replace the self in these with a pmf instance.

Member

ColCarroll Dec 5, 2019

I'd update this to use just pm.sample(draws=500, tune=500) (or similar!)

This should automatically use 1 chain per physical core and run diagnostics.

Member

ColCarroll Dec 5, 2019

this self.std is defined as 1 / alpha in the constructor, and is maybe only used here
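A small aside that may help when deciding what to do with that attribute: in pm.Normal, tau is a precision (tau = 1 / sigma**2), so if the value is meant to be the standard deviation implied by alpha, it would be 1 / sqrt(alpha) rather than 1 / alpha. The helper `precision_to_std` below is hypothetical and only illustrates the conversion; it makes no claim about what the notebook intends.

```python
import math


def precision_to_std(tau):
    """Convert a precision (tau = 1 / sigma**2) to a standard deviation.

    Hypothetical helper for illustration only.
    """
    if tau <= 0:
        raise ValueError("precision must be positive")
    return 1 / math.sqrt(tau)


alpha = 2  # the observation precision used in the example call in this review
std = precision_to_std(alpha)
print(f"alpha={alpha} -> std={std:.4f}")
```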

Member

ColCarroll Dec 5, 2019

pmf = make_pmf_model(train, dim=10, alpha=2, std=0.05)

Member

ColCarroll Dec 5, 2019

with pmf:
    map_estimate = pm.find_MAP()


          Cleaned up notebook

a6851ba

zaxtax changed the title ~~WIP: Add initial draft of matrix factorization example~~ Add matrix factorization example back to pymc3

ColCarroll merged commit 9e5177c into pymc-devs:master

Member

ColCarroll commented Dec 6, 2019

This is great @zaxtax -- thanks for getting this example back up and running! I can regen the docs so that it is up on the examples site, too.

zaxtax deleted the pmf_notebook branch

December 6, 2019 17:46
