Skip to content
This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

Add SLIM implementation #998

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

KimuraTian
Copy link
Contributor

slim demo

main classes
AbstractLinearRegression: Abstract class for elastic linear regression which is implemented by two different coordinate descent updates (CovarianceUpdateCoordDestlinearRegression and NaiveCoordDestLinearRegression)
SlimBuildContext: data context used by slim learning. (item rating map and item-item inner-products map)
SlimModelProvider: using ItemItemModel of knn package to find neighbors, and then training each item ItemItemModel.

@mdekstrand mdekstrand self-assigned this Mar 16, 2017
@mdekstrand
Copy link
Member

@KimuraTian Looks like there's a compile error that's causing the build to fail.

@KimuraTian
Copy link
Contributor Author

Thanks for mentioning it. I will check it again.

@codecov
Copy link

codecov bot commented Mar 17, 2017

Codecov Report

Merging #998 into master will increase coverage by 1.12%.
The diff coverage is 93.54%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #998      +/-   ##
============================================
+ Coverage     60.55%   61.67%   +1.12%     
- Complexity     2842     2940      +98     
============================================
  Files           421      432      +11     
  Lines         14582    14969     +387     
  Branches       1793     1841      +48     
============================================
+ Hits           8830     9232     +402     
+ Misses         4971     4944      -27     
- Partials        781      793      +12
Impacted Files Coverage Δ Complexity Δ
...im/src/main/java/org/lenskit/slim/NaiveUpdate.java 100% <100%> (ø) 5 <5> (?)
...src/main/java/org/lenskit/slim/SLIMItemScorer.java 100% <100%> (ø) 5 <5> (?)
...c/main/java/org/lenskit/slim/SLIMBuildContext.java 61.53% <61.53%> (ø) 7 <7> (?)
.../java/org/lenskit/slim/LinearRegressionHelper.java 66.66% <66.66%> (ø) 3 <3> (?)
...slim/src/main/java/org/lenskit/slim/SLIMModel.java 71.42% <71.42%> (ø) 2 <2> (?)
...in/java/org/lenskit/slim/SLIMUpdateParameters.java 88.88% <88.88%> (ø) 4 <4> (?)
.../main/java/org/lenskit/slim/SLIMModelProvider.java 91.66% <91.66%> (ø) 5 <5> (?)
...ain/java/org/lenskit/slim/SLIMScoringStrategy.java 95.65% <95.65%> (ø) 7 <7> (?)
...ava/org/lenskit/slim/SLIMBuildContextProvider.java 96.38% <96.38%> (ø) 16 <16> (?)
...a/org/lenskit/slim/FsSLIMBuildContextProvider.java 96.38% <96.38%> (ø) 16 <16> (?)
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e7c40c...71d4051. Read the comment docs.

Copy link
Member

@mdekstrand mdekstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll take me some time to fully grok the code, but here are some things that jump out quickly.

I like that the tests exercise so much of the code!

@DefaultImplementation(CovarianceUpdateCoordDestLinearRegression.class)
@Target({ElementType.TYPE, ElementType.PARAMETER})
@Retention(RetentionPolicy.RUNTIME)
public @interface LinearRegression {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interface doesn't seem to be adding anything - qualifiers would get applied to injection points, not to injectable interfaces, so it doesn't seem to be used.

*/
@LinearRegression
@DefaultImplementation(CovarianceUpdateCoordDestLinearRegression.class)
public abstract class AbstractLinearRegression {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is - right now - SLIM-specific, what if we called it something SLIM-specific? We'll generalize later.

Maybe call it SLIMScoringStrategy?

* @param weights weight
* @return prediction vector
*/
public Long2DoubleMap predict(Map<Long, Long2DoubleMap> testData, Long2DoubleMap weights) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have Long2ObjectMap<Long2DoubleMap>.

Although really, given how much they're used here and elsewhere, we should probably have a Long2Long2DoubleMap or LLDSparseMatrix or something like that, that encapsulates the concept. Not that you have to build such a thing for this PR, it just seems like it'd be useful here.

}

@Test
public void testLinearRegression() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should have some assertions in it.

Rename class name including SLIMScoringStrategy etc.
Change all Map<Long,Long2DoubleMap> to Long2ObjectMap<Long2DoubleMap>
Write several comments

TODO:
rewrite some test cases
@KimuraTian
Copy link
Contributor Author

Thank you so much. So far it might be not ready for review, some test cases will be added soon.

@mdekstrand mdekstrand changed the title Feature/tian slim Add SLIM implementation Mar 17, 2017
Copy link
Member

@mdekstrand mdekstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good job. I have a few cleanup comments mostly.

Why are we pulling in an item-item model? Is that just to reduce the search space?

Could we simplify by just having sets of co-rated items in the SLIM build context, rather than depending on a full item-item model? The full item-item model can be expensive train.

dotProdsOfXjXks.put(weightsIdk, dotProdOfXjXk);
}
innerProdsOfXs.put(j, dotProdsOfXjXks);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we de-duplicate loop bodies here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulling in an item-item model is just to use its hooks for different similarity strategies. Now the codes for similarity strategies are copied into SLIMBuildContextProvider which will return a mapping from item ids to sets of corresponding neighbors. All the dependencies on KNN are removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, loop bodies are de-duplicated here.

* @param b
* @return
*/
public static Long2DoubleMap addVectors(Long2DoubleMap a, Long2DoubleMap b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we have this in Vectors? If not we should.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a similar version 'Vectors.combine'. But 'Vectors.combine' may only combine the elements in common positions between two vectors. It may not work correctly for the sparse vector addition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! We should perhaps have both versions in Vectors; maybe combine and combineAndLimit?

* @param b
* @return
*/
public static Long2DoubleMap multiply(Long2DoubleMap a, Long2DoubleMap b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be in Vectors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may also not be used except tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call it ebeMultiply to match the commons-math method name? But also, it should live in Vectors, even if it is only used in tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I saw ebeMultiply in util.math.RowView. We can optimize it.

* @param matrix
* @return
*/
public static Long2ObjectMap<Long2DoubleMap> transposeMap(Long2ObjectMap<Long2DoubleMap> matrix) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should perhaps belong in a matrix utility class (to go with Vectors).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is not used in SLIM except in test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not matter; we can have utilities in those classes that are only used for tests.

// * @param c
// * @return
// */
// public static Long2DoubleMap multiply(Long2DoubleMap a, double c) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's delete this commented-out code.

* @param value a value needs to be filtered out
* @return
*/
public static Long2DoubleMap filterValues(Long2DoubleMap a, double value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to introduce a customizable epsilon here? Scalars.isZero is pretty restrictive by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A customizable epsilon is added

* @param trainingDataMatrix observations matrix row: user ratings for different items, column: item ratings of different users
* @return a trained weight vector
*/
public abstract Long2DoubleMap fit(Long2DoubleMap labels, Long2ObjectMap<Long2DoubleMap> trainingDataMatrix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually use this one anywhere other than in tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. It is just kept as an alternative and used as a double check in the tests in case something goes wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's consider dropping it, if it is easy enough to write the tests with the other method.



/**
* not used. wonder if abstract class is better than interface for different updates??
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's delete it.

@mdekstrand
Copy link
Member

Also, I don't see clear hooks for normalization right now - should we add those?

1. Add hooks for normalization
2. Add hooks for different similarity strategies to find neighbors.
Copied some codes in KNN.
3. Add mapping from item ids to sets of neighbors ids in
‘SLIMBuildcontext’ class
4. Add test cases for SLIMBuildContext class
Copy link
Member

@mdekstrand mdekstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am running an evaluation to see how this performs on real data - want output from that before merging.

Also, we do need to modify lenskit-all to depend on lenskit-slim.

}
k++;
lossDiff = Math.abs(loss[0] - loss[1]); // compute difference of loss function between two round of coordinate descent updates
logger.info("{}th iteration and loss function reduced to {} and {} \n and weights is {}", k, loss[0], loss[1], weights);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this debug-level; it's too noisy at info.

@Qualifier
@Target({ElementType.METHOD, ElementType.PARAMETER})
@Retention(RetentionPolicy.RUNTIME)
public @interface Epsilon {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this MinItemWeight.

* @param matrix
* @return
*/
public static Long2ObjectMap<Long2DoubleMap> transposeMap(Long2ObjectMap<Long2DoubleMap> matrix) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not matter; we can have utilities in those classes that are only used for tests.

* @param b
* @return
*/
public static Long2DoubleMap addVectors(Long2DoubleMap a, Long2DoubleMap b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! We should perhaps have both versions in Vectors; maybe combine and combineAndLimit?

* @param b
* @return
*/
public static Long2DoubleMap multiply(Long2DoubleMap a, Long2DoubleMap b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call it ebeMultiply to match the commons-math method name? But also, it should live in Vectors, even if it is only used in tests.


for (long key : aSet) {
double prod = a.get(key)*b.get(key);
productOfTwoVectors.put(key, prod);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if prod is zero, we don't need to put it in the vectors.

* @param epsilon tolerance
* @return a map with the values in {@code a} whose absolute value of differences from {@code value} is greater than {@code epsilon}.
*/
public static Long2DoubleMap filterValues(Long2DoubleMap a, double value, double epsilon) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also live in Vectors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We move it in this feature branch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please.

@DefaultInteger(0)
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.PARAMETER, ElementType.FIELD, ElementType.METHOD})
public @interface MinCoRatedItems {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knn has a MinCommonUsers parameter - let's reuse that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

@Qualifier
@Target({ElementType.METHOD, ElementType.PARAMETER})
@Retention(RetentionPolicy.RUNTIME)
public @interface Intercept {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this IncludeIntercept?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to remove it as so far it doesn't support updating with intercept?

@mdekstrand mdekstrand added this to the 3.0 milestone Jun 22, 2017
renamed `multiply` to `ebeMultiply` and optimized it for 0 values
removed `MincoRateditems`
renamed `Intercept` to `IncludeIntercept`
* @param b The second vector
* @return A new vector of the addition of two vectors {@code a} + {@code b}
*/
public static Long2DoubleMap combineAndLimit(Long2DoubleMap a, Long2DoubleMap b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one should be called add, and the old combine should be combineAndLimit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative design would be to introduce a 'mode' enum with values ALL, ALL_A, ALL_B, and INTERSECT (or COMMON).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they were renamed.

Renamed original `combineAndLimit` to `add`
@KimuraTian
Copy link
Contributor Author

I am finishing up HPF implementation. Hopefully, we can get it for your review within this week.

* @param b The second vector
* @return A new vector of the addition of two vectors {@code a} + {@code b}
*/
public static Long2DoubleMap combineAndLimit(Long2DoubleMap a, Long2DoubleMap b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative design would be to introduce a 'mode' enum with values ALL, ALL_A, ALL_B, and INTERSECT (or COMMON).

while (iter.hasNext()) {
long item = iter.nextLong();
Long2DoubleAccumulator accum;
if (modelSize == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this model size truncation something that Karypis et al did in their SLIM paper? Or did they just use the lasso regression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that paper, there are two types of SLIM. One is using all the items except the label item itself as neighborhood items to train the regression model, the other one is using cosine similarity to find neighborhood items and then using those items to train the regression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The paper calls it fsSLIM. So the model size is the number of neighbors. The paper used 100 and 150

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so that's what you were doing with the ItemItemModel in the old code. I did not understand this.

It looks like, to fix that, we now have a copy of item-item's model-build here.

I propose: we have two different context providers (with appropriate code sharing), one for fsSLIM and one for the other one. The fsSLIM can use an item-item model, as that will have the cosine similarities pre-computed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we now have a copy of item-item's model build here.

When SLIMModelSize is set to 0, it's SLIM otherwise fsSLIM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, ItemItemModel itself has ModelSize. By setting it to zero (or not), we get SLIM (or fsSLIM).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to have 2 different implementations in this case, rather than depending on a parameter value. There are several ways we can design this; since the only difference between SLIM and fsSLIM seems to be how the 'candidate neighbors' are selected, we can have a CandidiateNeighborSelector interface that abstracts the difference.

It's often good design, when we identify some piece of logic that has 2 configurable variants, to abstract it into an interface & implementations (the strategy design pattern), so we can reconfigure it. The fsSLIM implementation can then depend on knn's ItemItemModel in order to identify candidate neighbors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! I am trying to do so. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a provider FsSLIMBuildContextProvider which depends on knn's ItemItemModel
The SLIMBuildContextProvider now get all the items as neighbors.

*
* @author <a href="http://www.grouplens.org">GroupLens Research</a>
*/
public class SLIMScorer extends AbstractItemScorer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be called SLIMItemScorer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was renamed.

@Override
public ResultMap scoreWithDetails(long user, @Nonnull Collection<Long> items) {
logger.debug("scoring {} items for user {} with details", items.size(), user);
Long2DoubleMap ratings = Long2DoubleSortedArrayMap.create(rvDAO.userRatingVector(user));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to normalize the ratings and then denormalize the predictions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normalizer added.

* @param trainingDataMatrix observations matrix row: user ratings for different items, column: item ratings of different users
* @return a trained weight vector
*/
public abstract Long2DoubleMap fit(Long2DoubleMap labels, Long2ObjectMap<Long2DoubleMap> trainingDataMatrix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's consider dropping it, if it is easy enough to write the tests with the other method.

private void buildItemRatings(Long2ObjectMap<Long2DoubleSortedMap> itemVectors,
Long2ObjectMap<LongSortedSet> userItems) {
Long2ObjectMap<LongSet> userItemsTemp = new Long2ObjectOpenHashMap<>();
try (ObjectStream<IdBox<List<Rating>>> stream = dao.query(Rating.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really need to use user-vector normalization for consistency. Also, we should use the RatingVectorPDAO instead of a data access object.

That means we will be processing ratings user-by-user instead of item-by-item. Once the final context is built, there will be no difference, but the loop here will need to change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can fix it

@KimuraTian
Copy link
Contributor Author

So far the predictions can be within the range from 0 to 5, and seems reasonable using simulated ratings. It needs further to be tested by real data.

@mdekstrand mdekstrand modified the milestones: 3.0, 3.1 Feb 8, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants