Add SLIM implementation #998
base: master
Conversation
@KimuraTian Looks like there's a compile error that's causing the build to fail.
Thanks for mentioning it. I will check it again.
Codecov Report
@@ Coverage Diff @@
## master #998 +/- ##
============================================
+ Coverage 60.55% 61.67% +1.12%
- Complexity 2842 2940 +98
============================================
Files 421 432 +11
Lines 14582 14969 +387
Branches 1793 1841 +48
============================================
+ Hits 8830 9232 +402
+ Misses 4971 4944 -27
- Partials 781 793 +12
Continue to review full report at Codecov.
It'll take me some time to fully grok the code, but here are some things that jump out quickly.
I like that the tests exercise so much of the code!
@DefaultImplementation(CovarianceUpdateCoordDestLinearRegression.class)
@Target({ElementType.TYPE, ElementType.PARAMETER})
@Retention(RetentionPolicy.RUNTIME)
public @interface LinearRegression {
This interface doesn't seem to be adding anything - qualifiers would get applied to injection points, not to injectable interfaces, so it doesn't seem to be used.
*/
@LinearRegression
@DefaultImplementation(CovarianceUpdateCoordDestLinearRegression.class)
public abstract class AbstractLinearRegression {
Since this is - right now - SLIM-specific, what if we called it something SLIM-specific? We'll generalize later.
Maybe call it SLIMScoringStrategy?
* @param weights weight
* @return prediction vector
*/
public Long2DoubleMap predict(Map<Long, Long2DoubleMap> testData, Long2DoubleMap weights) {
We can have Long2ObjectMap<Long2DoubleMap>.
Although really, given how much they're used here and elsewhere, we should probably have a Long2Long2DoubleMap or LLDSparseMatrix or something like that that encapsulates the concept. Not that you have to build such a thing for this PR; it just seems like it'd be useful here.
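To make the shape concrete, here is a minimal sketch of what such a predict method does: each user's score is the dot product of the user's sparse rating vector with the learned weight vector. Plain java.util maps stand in for fastutil's Long2ObjectMap<Long2DoubleMap> and Long2DoubleMap, and the class name PredictSketch is made up for illustration; this is not the PR's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class PredictSketch {
    // Each user's SLIM score is the dot product of the user's sparse
    // rating vector with the learned weight vector.
    public static Map<Long, Double> predict(Map<Long, Map<Long, Double>> testData,
                                            Map<Long, Double> weights) {
        Map<Long, Double> predictions = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> user : testData.entrySet()) {
            double score = 0.0;
            for (Map.Entry<Long, Double> rating : user.getValue().entrySet()) {
                // items with no learned weight contribute nothing
                score += rating.getValue() * weights.getOrDefault(rating.getKey(), 0.0);
            }
            predictions.put(user.getKey(), score);
        }
        return predictions;
    }
}
```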
}

@Test
public void testLinearRegression() {
This test should have some assertions in it.
Renamed classes, including SLIMScoringStrategy etc.
Changed all Map<Long, Long2DoubleMap> to Long2ObjectMap<Long2DoubleMap>
Wrote several comments
TODO: rewrite some test cases
Thank you so much. So far it might not be ready for review; some test cases will be added soon.
Hopefully, this version is ready for review.
Very good job. I have a few cleanup comments mostly.
Why are we pulling in an item-item model? Is that just to reduce the search space?
Could we simplify by just having sets of co-rated items in the SLIM build context, rather than depending on a full item-item model? The full item-item model can be expensive to train.
dotProdsOfXjXks.put(weightsIdk, dotProdOfXjXk);
    }
    innerProdsOfXs.put(j, dotProdsOfXjXks);
}
can we de-duplicate loop bodies here?
Pulling in an item-item model is just to use its hooks for different similarity strategies. Now the code for similarity strategies is copied into SLIMBuildContextProvider, which returns a mapping from item ids to sets of corresponding neighbors. All the dependencies on KNN are removed.
Yes, loop bodies are de-duplicated here.
* @param b
* @return
*/
public static Long2DoubleMap addVectors(Long2DoubleMap a, Long2DoubleMap b) {
Don't we have this in Vectors? If not, we should.
We have a similar version, Vectors.combine. But Vectors.combine only combines the elements at positions common to both vectors, so it may not work correctly for sparse vector addition.
Good point! We should perhaps have both versions in Vectors; maybe combine and combineAndLimit?
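A sketch of how the two flavors could differ, using the names add and combineAndLimit from this suggestion. Plain java.util maps stand in for fastutil's Long2DoubleMap, and the class name VectorsSketch is hypothetical; this is not the actual Vectors implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class VectorsSketch {
    // Union semantics: every key present in either vector appears in the
    // result, so sparse vector addition is handled correctly.
    public static Map<Long, Double> add(Map<Long, Double> a, Map<Long, Double> b) {
        Map<Long, Double> result = new HashMap<>(a);
        for (Map.Entry<Long, Double> e : b.entrySet()) {
            result.merge(e.getKey(), e.getValue(), Double::sum);
        }
        return result;
    }

    // Intersection semantics: only keys common to both vectors survive,
    // which is effectively what the existing combine does.
    public static Map<Long, Double> combineAndLimit(Map<Long, Double> a, Map<Long, Double> b) {
        Map<Long, Double> result = new HashMap<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double bv = b.get(e.getKey());
            if (bv != null) {
                result.put(e.getKey(), e.getValue() + bv);
            }
        }
        return result;
    }
}
```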
* @param b
* @return
*/
public static Long2DoubleMap multiply(Long2DoubleMap a, Long2DoubleMap b) {
This should also be in Vectors
This may also not be used except in tests.
Maybe call it ebeMultiply to match the commons-math method name? But also, it should live in Vectors, even if it is only used in tests.
Yes, I saw ebeMultiply in util.math.RowView. We can optimize it.
* @param matrix
* @return
*/
public static Long2ObjectMap<Long2DoubleMap> transposeMap(Long2ObjectMap<Long2DoubleMap> matrix) {
This should perhaps belong in a matrix utility class (to go with Vectors).
This method is not used in SLIM except in tests.
That does not matter; we can have utilities in those classes that are only used for tests.
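For reference, a transpose over nested sparse maps can be sketched like this. Plain java.util maps stand in for Long2ObjectMap<Long2DoubleMap>, and the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class TransposeSketch {
    // Transposes a sparse row-major matrix (outer key = row id, inner key =
    // column id) by re-inserting each entry under its column key.
    public static Map<Long, Map<Long, Double>> transpose(Map<Long, Map<Long, Double>> matrix) {
        Map<Long, Map<Long, Double>> result = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> row : matrix.entrySet()) {
            long i = row.getKey();
            for (Map.Entry<Long, Double> cell : row.getValue().entrySet()) {
                result.computeIfAbsent(cell.getKey(), k -> new HashMap<>())
                      .put(i, cell.getValue());
            }
        }
        return result;
    }
}
```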
// * @param c
// * @return
// */
// public static Long2DoubleMap multiply(Long2DoubleMap a, double c) {
let's delete this commented-out code.
* @param value a value that needs to be filtered out
* @return
*/
public static Long2DoubleMap filterValues(Long2DoubleMap a, double value) {
We might want to introduce a customizable epsilon here? Scalars.isZero is pretty restrictive by default.
A customizable epsilon has been added.
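A minimal sketch of what a filterValues with a customizable epsilon might look like, matching the javadoc quoted later in this thread. Plain java.util maps stand in for Long2DoubleMap; this is illustrative, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

public class FilterSketch {
    // Keeps only entries whose value differs from `value` by more than
    // `epsilon`; the epsilon parameter replaces the restrictive
    // Scalars.isZero default tolerance.
    public static Map<Long, Double> filterValues(Map<Long, Double> a, double value, double epsilon) {
        Map<Long, Double> result = new HashMap<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            if (Math.abs(e.getValue() - value) > epsilon) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```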
* @param trainingDataMatrix observations matrix (row: user ratings for different items; column: item ratings of different users)
* @return a trained weight vector
*/
public abstract Long2DoubleMap fit(Long2DoubleMap labels, Long2ObjectMap<Long2DoubleMap> trainingDataMatrix);
Do we actually use this one anywhere other than in tests?
No. It is just kept as an alternative and used as a double check in the tests in case something goes wrong.
Then let's consider dropping it, if it is easy enough to write the tests with the other method.
/**
* not used. wonder if abstract class is better than interface for different updates??
let's delete it.
Also, I don't see clear hooks for normalization right now - should we add those?
1. Add hooks for normalization
2. Add hooks for different similarity strategies to find neighbors. Copied some code from KNN.
3. Add a mapping from item ids to sets of neighbor ids in the SLIMBuildContext class
4. Add test cases for the SLIMBuildContext class
I am running an evaluation to see how this performs on real data - want output from that before merging.
Also, we do need to modify lenskit-all to depend on lenskit-slim.
}
k++;
lossDiff = Math.abs(loss[0] - loss[1]); // compute difference of loss function between two rounds of coordinate descent updates
logger.info("{}th iteration and loss function reduced to {} and {} \n and weights is {}", k, loss[0], loss[1], weights);
Let's make this debug-level; it's too noisy at info.
@Qualifier
@Target({ElementType.METHOD, ElementType.PARAMETER})
@Retention(RetentionPolicy.RUNTIME)
public @interface Epsilon {
Let's call this MinItemWeight.
for (long key : aSet) {
    double prod = a.get(key)*b.get(key);
    productOfTwoVectors.put(key, prod);
If prod is zero, we don't need to put it in the vectors.
* @param epsilon tolerance
* @return a map with the values in {@code a} whose absolute value of differences from {@code value} is greater than {@code epsilon}.
*/
public static Long2DoubleMap filterValues(Long2DoubleMap a, double value, double epsilon) {
This should also live in Vectors.
Should we move it in this feature branch?
Yes, please.
@DefaultInteger(0)
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.PARAMETER, ElementType.FIELD, ElementType.METHOD})
public @interface MinCoRatedItems {
knn has a MinCommonUsers parameter - let's reuse that?
Yep.
@Qualifier
@Target({ElementType.METHOD, ElementType.PARAMETER})
@Retention(RetentionPolicy.RUNTIME)
public @interface Intercept {
Maybe call this IncludeIntercept?
Do we need to remove it, since so far it doesn't support updating with an intercept?
Renamed `multiply` to `ebeMultiply` and optimized it for 0 values
Removed `MinCoRatedItems`
Renamed `Intercept` to `IncludeIntercept`
* @param b The second vector
* @return A new vector of the addition of two vectors {@code a} + {@code b}
*/
public static Long2DoubleMap combineAndLimit(Long2DoubleMap a, Long2DoubleMap b) {
I think this one should be called add, and the old combine should be combineAndLimit.
An alternative design would be to introduce a 'mode' enum with values ALL, ALL_A, ALL_B, and INTERSECT (or COMMON).
Yes, they were renamed.
Renamed original `combineAndLimit` to `add`
I am finishing up the HPF implementation. Hopefully, we can get it to you for review within this week.
while (iter.hasNext()) {
    long item = iter.nextLong();
    Long2DoubleAccumulator accum;
    if (modelSize == 0) {
Is this model size truncation something that Karypis et al did in their SLIM paper? Or did they just use the lasso regression?
In that paper, there are two types of SLIM. One uses all the items except the label item itself as neighborhood items to train the regression model; the other uses cosine similarity to find neighborhood items and then uses those items to train the regression.
The paper calls it fsSLIM. So the model size is the number of neighbors; the paper used 100 and 150.
Ok, so that's what you were doing with the ItemItemModel in the old code. I did not understand this.
It looks like, to fix that, we now have a copy of item-item's model-build here.
I propose: we have two different context providers (with appropriate code sharing), one for fsSLIM and one for the other one. The fsSLIM can use an item-item model, as that will have the cosine similarities pre-computed.
Yes, we now have a copy of item-item's model build here.
When SLIMModelSize is set to 0, it's SLIM; otherwise, fsSLIM.
Actually, ItemItemModel itself has ModelSize. By setting it to zero (or not), we get SLIM (or fsSLIM).
I think it would be better to have 2 different implementations in this case, rather than depending on a parameter value. There are several ways we can design this; since the only difference between SLIM and fsSLIM seems to be how the 'candidate neighbors' are selected, we can have a CandidateNeighborSelector interface that abstracts the difference.
It's often good design, when we identify some piece of logic that has 2 configurable variants, to abstract it into an interface & implementations (the strategy design pattern), so we can reconfigure it. The fsSLIM implementation can then depend on knn's ItemItemModel in order to identify candidate neighbors.
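A sketch of that strategy-pattern design. The interface name CandidateNeighborSelector comes from this suggestion, but both implementation class names are hypothetical, and a plain map stubs in for the precomputed item-item similarity model.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class NeighborStrategySketch {
    // Strategy interface: how candidate neighbors are chosen is the only
    // difference between SLIM and fsSLIM.
    public interface CandidateNeighborSelector {
        Set<Long> candidateNeighbors(long item, Set<Long> allItems);
    }

    // SLIM variant: every item except the label item itself is a candidate.
    public static class AllItemsSelector implements CandidateNeighborSelector {
        @Override
        public Set<Long> candidateNeighbors(long item, Set<Long> allItems) {
            Set<Long> result = new HashSet<>(allItems);
            result.remove(item);   // exclude the label item itself
            return result;
        }
    }

    // fsSLIM variant: candidates come from a precomputed similarity model,
    // stubbed here as a map from item id to its neighbor set.
    public static class SimilarityModelSelector implements CandidateNeighborSelector {
        private final Map<Long, Set<Long>> neighbors;

        public SimilarityModelSelector(Map<Long, Set<Long>> neighbors) {
            this.neighbors = neighbors;
        }

        @Override
        public Set<Long> candidateNeighbors(long item, Set<Long> allItems) {
            return neighbors.getOrDefault(item, Set.of());
        }
    }
}
```

The build context provider would then take a CandidateNeighborSelector as an injected dependency, and configuration would bind one implementation or the other.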
Thank you so much! I am trying to do so. :)
Added a provider FsSLIMBuildContextProvider which depends on knn's ItemItemModel. The SLIMBuildContextProvider now gets all the items as neighbors.
*
* @author <a href="http://www.grouplens.org">GroupLens Research</a>
*/
public class SLIMScorer extends AbstractItemScorer {
This should be called SLIMItemScorer
It was renamed.
@Override
public ResultMap scoreWithDetails(long user, @Nonnull Collection<Long> items) {
    logger.debug("scoring {} items for user {} with details", items.size(), user);
    Long2DoubleMap ratings = Long2DoubleSortedArrayMap.create(rvDAO.userRatingVector(user));
We need to normalize the ratings and then denormalize the predictions.
Normalizer added.
private void buildItemRatings(Long2ObjectMap<Long2DoubleSortedMap> itemVectors,
                              Long2ObjectMap<LongSortedSet> userItems) {
    Long2ObjectMap<LongSet> userItemsTemp = new Long2ObjectOpenHashMap<>();
    try (ObjectStream<IdBox<List<Rating>>> stream = dao.query(Rating.class)
We really need to use user-vector normalization for consistency. Also, we should use the RatingVectorPDAO instead of a data access object.
That means we will be processing ratings user-by-user instead of item-by-item. Once the final context is built, there will be no difference, but the loop here will need to change.
Yes, we can fix it
Renamed `SLIMScorer` to `SLIMItemScorer`
Added UserVectorNormalizer into building context
Add test cases for context
So far the predictions fall within the range from 0 to 5, and seem reasonable using simulated ratings. They still need to be tested with real data.
# Conflicts: # lenskit-core/src/main/java/org/lenskit/util/math/Vectors.java
slim demo
main classes
AbstractLinearRegression: abstract class for elastic net linear regression, implemented by two different coordinate descent updates (CovarianceUpdateCoordDestLinearRegression and NaiveCoordDestLinearRegression)
SlimBuildContext: data context used by SLIM learning (item rating map and item-item inner-products map)
SlimModelProvider: uses the ItemItemModel of the knn package to find neighbors, and then trains a regression model for each item.