Performance in Filtering Algorithms #633
Hi Subho,
This is an interesting one. It seems that in CASE 1, Scala is doing something odd. When I bring this up in the debugger, the type returned by compute_sum in CASE 1 is a ((Boolean) => (Boolean)) => Element[Int]. I'm not sure why this is the case – I think it should just be an Element[Int], so I'm not sure why it's returning a function. In CASE 2, it does correctly return an Element[Int] (a Fold[Int] class, which is the same). To summarize, CASE 1 is doing nothing, but I'm not sure why.
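The CASE 1 behavior can be reproduced with plain Scala collections, without Figaro at all. This sketch (illustrative only) shows how a bare underscore in count(_) eta-expands into a function that is never applied, rather than performing the count:

```scala
object PlaceholderDemo extends App {
  val flags = List(true, false, true)

  // `count(_)` does not count anything: the placeholder expands to
  // x => flags.count(x), a function still waiting for its predicate.
  // (The type annotation is needed here for the expansion to compile.)
  val pending: (Boolean => Boolean) => Int = flags.count(_)

  // Only once a predicate is supplied does any evaluation happen,
  // which is what CASE 2 does explicitly with count { (b) => b }.
  println(pending(b => b))     // 2
  println(flags.count(b => b)) // 2
}
```

This matches the debugger observation: the expression builds a function expecting a predicate, and since nothing ever applies it, no element is created.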
Now, for CASE 2, you’re doing the summation of a bunch of random values. The reason it’s slow is that this will probably result in some very large factors since there are many output combinations with 100 random numbers. However, since you’re not operating on the sum (just using it for output), it should still be feasible to run (albeit still slow, especially as time increases).
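To put rough numbers on why the summation is expensive, here is a small plain-Scala sketch (illustrative only, no Figaro, with n kept smaller than the 100 elements in the model above): the sum itself has few distinct values, but the factor relating it to its inputs must account for the full joint state space of the inputs:

```scala
object FactorSizeDemo extends App {
  // n counters, each taking one of 2 values (the Flip in transition).
  val n = 20 // kept small; the model above uses 100

  // Joint assignments the summation factor ranges over: 2^n.
  val jointStates = BigInt(2).pow(n)

  // Distinct values of the resulting count/sum: only n + 1.
  val sumValues = n + 1

  println(jointStates) // 1048576
  println(sumValues)   // 21
}
```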
Brian
+ + + + + + + + + + + + + +
Brian Ruttenberg, PhD
Senior Scientist
Decision Management Systems Division
Government Services
Charles River Analytics Inc.
617.491.3474 x730
www.cra.com
From: Subho Banerjee [mailto:notifications@github.com]
Sent: Wednesday, November 30, 2016 7:01 PM
To: p2t2/figaro <figaro@noreply.github.com>
Subject: [p2t2/figaro] Performance in Filtering Algorithms (#633)
I am trying to figure out how to efficiently implement a factoring algorithm in Figaro. Please have a look at the following code:
import com.cra.figaro.language._
import com.cra.figaro.library.collection._
import com.cra.figaro.algorithm.filtering._

object test {
  lazy val init = () => {
    val u = Universe.createNew()
    val a = new FixedSizeArray[Int](100, (i) => Constant[Int](0)("val" + i, u))
    compute_sum(a, u)
    u
  }

  lazy val transition = (u: Universe) => {
    val next = Universe.createNew()
    val a = new FixedSizeArray[Int](100, (i) =>
      Apply(u.getElementByReference[Int]("val" + i), Flip(0.5), (v: Int, f: Boolean) => {
        if (f) v + 1 else v - 1
      })("val" + i, next))
    compute_sum(a, next)
    next
  }

  def compute_sum(a: FixedSizeArray[Int], u: Universe) = {
    // CASE 1
    a.map { (i) => i > 0 }.count(_)
    // CASE 2
    a.map { (i) => i > 0 }.count { (b) => b }("ANSWER1", u)
  }

  def main(args: Array[String]): Unit = {
    val inf = FactoredFrontier(init(), transition, 50)
    inf.start
    for (i <- 1 to 1000) {
      inf.advanceTime()
      println(i)
      println(inf.computeCurrentExpectation[Int]("ANSWER1", (i: Int) => i))
    }
    inf.kill
  }
}
When I use CASE 1 in compute_sum, the program finishes in a few seconds. However, when I use CASE 2 it takes much, much more time.
Thanks @bruttenberg, that definitely helps. Are there any "best-practices" guides available for getting the best performance out of a model? I am trying to model a system that computes a factored frontier over 20k-30k variables (~100k-size state space), and Figaro doesn't seem to scale to a model of this size. As far as I can gather from the web, Figaro has the ability to recognize repeated (and independent) sub-models and can optimize the overall computation by memoizing the results from these sub-models. How do I ensure these optimizations are always successfully applied?
Hi Subho,
We don’t have an explicit best practices guide, though that is actually a great idea. A lot of the best practices for probabilistic programming languages and probabilistic models in general apply here, but I admit those best practices are probably not written down anywhere. A lot of the time it just sits in people’s heads as experience.
For your specific case, yes, this is quite a large model. The number of variables isn’t so much the problem as the state space, which will cause problems when you’re trying to multiply/sum factors. In general, doing things like abstracting states or compacting the state space will help. We do also know some tricks that can be done based on what we know about the underlying factors (e.g., some elements create sparse factors). But I don’t have any specific recommendations without knowing more about the problem.
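As a plain-Scala illustration of the state-compaction idea (a hypothetical bucketing scheme, not a Figaro API): mapping a fine-grained integer state onto coarse buckets shrinks pairwise factors quadratically. In Figaro terms, this kind of coarsening can often be expressed by an Apply over the original element.

```scala
object CompactionDemo extends App {
  // Hypothetical raw state: an integer in 0..99 (100 states).
  val rawStates = (0 until 100).toList

  // Coarsen to deciles: an abstraction with only 10 states.
  def bucket(v: Int): Int = v / 10
  val compactStates = rawStates.map(bucket).distinct

  // A factor over two raw variables has 100 * 100 entries;
  // over two bucketed variables, only 10 * 10.
  println(compactStates.size)                      // 10
  println(rawStates.size * rawStates.size)         // 10000
  println(compactStates.size * compactStates.size) // 100
}
```

The trade-off, of course, is precision: the abstraction must be coarse enough to shrink the factors but fine enough to preserve the distinctions the query cares about.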
As for your comment on memoization: yes, we have an experimental algorithm that might be able to help in this case. It’s called structured factored inference (SFI), and it decomposes a model into smaller problems that are solved independently and rolled up. In general, we see improvements using SFI over non-SFI inference (even without memoization). Memoization might help here, but keep in mind this is not true lifted inference (if you’re familiar with that), so its utility really depends on how the model is coded. We don’t have SFI working with Factored Frontier yet, but that is a pretty simple extension that can be done.
I am trying to figure out how to efficiently implement a factoring algorithm in Figaro. Please have a look at the code above. When I use CASE 1 in compute_sum, the program finishes in a few seconds. However, when I use CASE 2 it takes much, much more time (~30 mins: 100x??). I don't quite understand why these two differ in performance.